1 Title of Invention

Method to utilize 5' ends of transcribed regions for cloning and analysis

[Claim 1] A method to prepare nucleic acid tags corresponding to the 5' end of transcribed regions, said mRNA.

[Claim 2] A method according to claim 1, wherein concatemers of such 5' end tags are produced.

[Claim 3] A method in which such 5' end specific sequence tags derived from transcribed regions, said mRNA, are analyzed by sequencing.

[Claim 4] A method for preparing concatemers of a plurality of at least two or more nucleic acid fragments having information on nucleotide sequences of 5' end regions of a plurality of nucleic acids related to transcribed regions in a sample, comprising:

a first step of selectively collecting a plurality of cDNAs containing regions complementary to 5'-end regions of mRNAs, which cDNAs are formed by using RNA or mRNAs derived from a biological sample, or in vitro synthesized RNA derived from cDNA or tag libraries in the sample, as templates;

a second step of collecting fragments containing cDNA regions including at least the regions corresponding to the 5'-end regions of said mRNAs or cDNA;

and a third step of creating a concatemer of such 5' end nucleic acid tags.

[Claim 5] The method as in claim 4, in which
the first step is substituting the cap structure of mRNAs with an oligonucleotide;

the second step consists in the formation of full-length cDNA;

the third step involves cleavage of a 5' end tag and formation of concatemers.

[Claim 6] The method according to claim 4, wherein said first step comprises the steps of synthesizing the first-strand cDNAs using mRNAs as templates; attaching a selective binding substance to the cap structures of said mRNAs; cleaving single-stranded RNAs; binding said selective binding substance to a corresponding selective binding substance immobilized on a support, which corresponding selective binding substance selectively binds to said selective binding substance; and recovering said cDNA.

[Claim 7] A method as in claim 4, wherein the first step to isolate the full-length cDNA includes an RNase digestion step, followed by treatment with an immobilized cap-binding substance, followed by eluting such full-length cDNAs.

[Claim 8] A method to add a sequence to the 5' end of nucleic acids corresponding to the 5' terminal part of a transcript, wherein such sequence can be recognized by a substance that is capable of cleaving such nucleic acids outside the recognition sequence.

[Claim 9] The method according to claim 4, wherein said selective binding substance is biotin, and said corresponding selective binding substance is avidin, streptavidin or an avidin or streptavidin derivative which specifically binds to biotin.

[Claim 10] The method according to claim 4, wherein the selective binding substance is digoxigenin and said corresponding binding substance is an antibody directed against digoxigenin.

[Claim 11] A method according to claim 4 or 9, wherein a selective binding substance is bound to a corresponding selective binding substance which is immobilized onto a support, and wherein such a support is made of magnetic beads, agarose beads, or latex beads.

[Claim 12] The method according to any one of claims 4 and 6 to 11, wherein said second step comprises the steps of binding a linker having at least a restriction site for a substance that cleaves DNA outside its recognition sequence in the end region corresponding to the 5' end of said nucleic acids corresponding to the 5' end of genes, and a random oligomer region at the 3' end region; synthesizing a second-strand cDNA using said linker or other oligonucleotides partially or totally corresponding to the linker as a primer and said cDNA as a template; treating the obtained linker-bound double-stranded cDNA with said restriction enzyme; and selectively recovering fragments yielded by cleavage by the restriction enzyme, which fragments contain said linker moieties and part of the 5' end cDNA.

[Claim 13] The method according to any of claims 4 to 12, wherein a selective binding substance is attached to said linker; and the step of selectively recovering said fragments containing said linker moieties comprises the steps of binding said selective binding substance to a corresponding selective binding substance immobilized on a support, which corresponding selective binding substance selectively binds to said selective binding substance; and recovering said support.

[Claim 14] The method according to any of claims 4 to 13, wherein said selective binding substance is biotin, and said corresponding selective binding substance is avidin, streptavidin, or a derivative of avidin or streptavidin which specifically binds to biotin.

[Claim 15] The method according to any of claims 4 to 13, wherein the selective binding substance is digoxigenin and said corresponding binding substance is an antibody directed against digoxigenin.

[Claim 16] The method according to any one of claims 4 to 15, wherein said restriction enzyme is a substance with an enzymatic activity to recognize nucleic acid and to cleave at a site different from the recognition site.

[Claim 17] The method according to any one of claims 4 to 16, wherein said restriction enzyme is a class II restriction enzyme such as GsuI, MmeI, BpmI or BsgI.

[Claim 18] The method using nucleic acid fragments obtained according to any one of claims 1 to 18, further comprising the step of cloning into a concatemer.

[Claim 19] A method for determining nucleotide sequences of 5'-end regions of a plurality of mRNAs by sequencing said concatemer prepared by the method according to any one of claims 1 to 18.

[Claim 20] A method, which is the same method according to any one of claims 1 to 18, except that preliminarily obtained cDNAs having complete length are used instead of carrying out said first step.

[Claim 21] A method to produce 5' end nucleic acid tags corresponding to the 5' ends of mRNA, in which a mixture of RNA molecules is prepared from a preexisting full-length cDNA library and the obtained RNAs carry at their 5' end a sequence cleavable by a substance able to recognize a nucleic acid and cleave outside its recognition sequence.

[Claim 22] A method to produce 5' end nucleic acid tags corresponding to the 5' ends of mRNA, in which a mixture of nucleic acid TAG molecules is prepared from a preexisting full-length cDNA library carrying, close to the 5' end, a sequence cleavable by a substance able to recognize a nucleic acid and cleave outside its recognition sequence, which is used to produce a nucleic acid TAG molecule.

[Claim 23] The concatemer prepared by the method according to any one of claims 1 to 22.

[Claim 24] A vector comprising said concatemer according to claim 23.

[Claim 25] A sequence, which is derived from a concatemer prepared by the method according to any one of claims 1 to 22.

[Claim 26] A method based on any of claims 1 to 22, which allows determination of the transcriptional status of a given cell and therefore the transcriptional networking.

[Claim 27] A method, which is the same method according to any one of claims 1 to 22, to obtain expression data on a plurality of mRNA or cDNA in a sample.

[Claim 28] A method, which is the same method according to any one of claims 1 to 22, to quantify expression data on a plurality of mRNA in a sample.

[Claim 29] A method, which uses sequence information obtained from concatemers prepared by a method according to any one of claims 1 to 22 to build a database holding sequence information derived from the concatemers.

[Claim 30] A method, which is the same method according to any one of claims 1 to 22, to identify open reading frames in a genomic sequence, said genome.

[Claim 31] A method, which is the same method according to any one of claims 1 to 22, to identify start sites of transcription and regulatory sequences upstream of the start site of transcription in a genomic sequence, said genome.

[Claim 32] A method, which uses sequence information obtained from concatemers prepared by a method according to any one of claims 1 to 22, to clone a full-length or partial cDNA from a plurality of nucleic acids.

[Claim 33] A method, which uses sequence information obtained from concatemers prepared by a method according to any one of claims 1 to 22, to analyze the activity of regulatory regions in a genome, said promoter.

[Claim 34] A method, which uses sequence information obtained from concatemers prepared by a method according to any one of claims 1 to 22, to inactivate a gene.

[Claim 35] A method, which uses sequence information obtained from concatemers prepared by a method according to any one of claims 1 to 22, to synthesize nucleotide sequences, said linker.

[Claim 36] A method, which uses sequence information obtained from concatemers prepared by a method according to any one of claims 1 to 22, to synthesize nucleotide sequences, said primers.

[Claim 37] A method, which uses sequence information obtained from concatemers prepared by a method according to any one of claims 1 to 22, to obtain extended nucleotide sequences derived from the 5' ends of transcripts, said sequencing.

[Claim 38] A method according to any one of claims 1 to 8, wherein a single-stranded cDNA is ligated to a double-stranded synthetic oligonucleotide, said linker, wherein the linker has a single-stranded overhang encompassing a nucleotide sequence, said tag, which was obtained from concatemers prepared by a method according to any one of claims 1 to 13, wherein the linker is attached to a selective binding substance and the selective binding substance is attached to a corresponding selective binding substance, said support, and wherein such linker bound to the support is used to enrich a specific nucleotide sequence, said first-strand cDNA, said RNA transcript.

[Claim 39] A method according to any one of claims 1 to 8, wherein a single-stranded cDNA is ligated to a double-stranded linker, said primer, wherein a selective binding substance is attached to said linker, and wherein the selective binding substance is attached to a corresponding selective binding substance, said support, and wherein such DNA template is used to obtain the nucleotide sequences of the 5' region of an initial transcript, said RNA.

[Claim 40] A method based on any of claims 1 to 39 to be used for the development of diagnostic tools.

3 Detailed Description of Invention
[0001]

The present invention relates to a method to selectively collect multiple nucleic acid fragments containing information on the nucleotide sequences at the 5' end site of multiple mRNAs within a sample. The method of the present invention is effective for analyzing the mRNAs contained within the sample, for discovering new genes, and for studies on gene regulation.

[0002]

To utilize genomic information, parts of the genome are transcribed into mRNA. For the understanding of the genome and its use in regulatory processes, information on individual mRNA species is required, which should include their partial or full-length nucleotide sequence and their relative or absolute quantity in a given biological context.
[0003]

Conventionally, the base sequences in mRNAs contained in a cell or tissue sample have been analyzed by preparing a cDNA library by reverse transcription, using mRNAs as templates, and investigating the individual insert cDNA fragments within said cDNA library. Since a sample contains a large number of varied mRNAs, the conventional method is of limited efficiency for analyzing gene expression profiles and identifying rare genes. Therefore other technologies have been invented to monitor the expression patterns of mRNA in complex samples and to identify genes by short sequence elements called tags.
[0004]

High-throughput expression profiling is commonly performed by the use of so-called DNA microarrays (Jordan B., DNA Microarrays: Gene Expression Applications, Springer-Verlag, Berlin Heidelberg New York, 2001; Schena A, DNA Microarrays, A Practical Approach, Oxford University Press, Oxford 1999). For such experiments, specific probes representing individual genes or transcripts are placed on a support and simultaneously hybridized with a plurality of samples. Positive signals will be obtained where a probe on the support reacts with a molecule present in the sample. These experiments allow the parallel analysis of a large number of genes or transcripts. However, the approach is limited by the fact that only genes or transcripts that were initially identified by other experimental means can be studied. Such means can include cDNA libraries, partial sequence tags and/or results obtained from computer predictions. Due to the limitations of DNA microarray experiments, alternative approaches are in use for gene discovery and expression profiling, which are based on partial sequences, called tags, obtained from a plurality of mRNA samples.
[0005]

The so-called SAGE (Serial Analysis of Gene Expression) method is known as an efficient method of obtaining partial information on the base sequences in mRNAs (Velculescu V.E. et al., Science 270, 484-487 (1995)). This method forms DNA concatemers by ligating multiple short DNA fragments (about 10 bp) containing information on the base sequences at the 3' end site of multiple mRNAs, and determines the base sequences in these DNA concatemers. It is thus a method for obtaining partial information on the base sequences at the 3' end site of multiple mRNAs. When only a short base sequence close to the 3' end is available but the mRNA itself is already known, the SAGE method can often identify the mRNA, although the available base sequence is as short as about 10 bp. This method is currently in wide use as an important method for analyzing genes expressed in specific cells or tissues.
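
Computationally, analyzing expressed genes from sequence tags comes down to counting how often each tag is observed. The following is a minimal Python sketch of that idea, assuming the tags have already been read out of sequenced concatemers. The function name `tag_frequencies` and the example tags are hypothetical illustrations and are not taken from the SAGE publication.

```python
from collections import Counter


def tag_frequencies(tags):
    """Return {tag: (count, fraction of all tags)} for a list of sequence tags."""
    counts = Counter(tags)
    total = sum(counts.values())
    return {tag: (n, n / total) for tag, n in counts.items()}


# Hypothetical 10-nt tags (made-up data, for illustration only).
example_tags = [
    "ACGTACGTAC",
    "TTGACCATGA",
    "ACGTACGTAC",
    "GGCATTACGA",
    "ACGTACGTAC",
]
freqs = tag_frequencies(example_tags)
for tag, (count, fraction) in sorted(freqs.items()):
    print(tag, count, f"{fraction:.2f}")
```

The relative fraction is only a proxy for transcript abundance; it presupposes that every tag is captured and sequenced with similar efficiency.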
[0006]

While the SAGE method can be used to learn a partial base sequence at the 3' end site of mRNAs, it is difficult to clone new genes based on the information in such short sequences at the 3' end site alone. Despite its application, SAGE does not teach how to obtain cDNA clones close to the 5' end of the cDNA. In fact, 4 bp restriction enzymes of Class IIs are used. A 4 bp cutter usually cleaves on average every few hundred nucleotides, which is on average 1/10th of the average size of an mRNA transcript. Thus SAGE principles strongly suggest that 3' ends are collected with high prevalence, and no information can be collected about the 5' end for most of the transcripts. In addition, 10 bp tags have often been insufficient for specific gene identification and mapping to genomic sequences, that is, entire or partial genomes. Therefore, the 10 bp tags are used to identify only a "sage-tag", which comprises a part of an mRNA. Notice that mammalian mRNA comprises only 3-5% of the transcribed part of the mammalian genome and the specific "sage-tag" comprises a subfraction of this 3-5%, which lies in proximity of the Class IIs restriction enzyme site used in the analysis. Since a 4 bp restriction enzyme cuts a random sequence approximately every 4^4 bp (256 bp), the "sage-tags" can represent approximately 1/256 of the 3-5% expressed fraction of the genome (calculation: less than 0.02%). Therefore, the SAGE techniques teach that essentially it is not possible to use SAGE-tags to analyze a genome but only a very limited fraction of it.
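
The estimate above can be checked with a few lines of arithmetic. This Python sketch simply restates the text's own assumptions (a 4 bp recognition site occurring once every 4^4 bp in random sequence, and an expressed fraction of 3-5%). The function name is a hypothetical label for illustration.

```python
def sage_tag_genome_fraction(expressed_fraction, site_length=4):
    """Estimate the genome fraction represented by tags next to a cutter site.

    Assumes random sequence with equal base frequencies, so a site of
    `site_length` bases occurs on average once every 4**site_length bp.
    """
    mean_spacing = 4 ** site_length  # 256 bp for a 4 bp cutter
    return expressed_fraction / mean_spacing


for expressed in (0.03, 0.05):
    covered = sage_tag_genome_fraction(expressed)
    print(f"{expressed:.0%} expressed -> about {covered:.4%} of the genome")
```

For both ends of the 3-5% range the result stays below 0.02%, in line with the figure quoted in the text.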



[0007]

Accordingly, the invention claimed in this application aims to provide new means that not only enable the acquisition of information on the base sequences of 5' ends in mRNAs within a sample, but also enable the cloning of new genes and the analysis of genomic sequence information corresponding to coding and regulatory regions.
[0008]

This can include statistics on the DNA transcriptional starting site. By using concatemers to obtain information on a large number of 5' sequence tags as presented in the invention, it is possible to effectively map transcriptional start sites and the related promoter sequences. Thus the invention provides new means, whereas SAGE did not allow any promoter analysis due to the use of unrelated 3' ends. At the same time, there were techniques for the collection of full-length cDNA clones and sequences derived thereof; however, those focus on collecting the full-length cDNA clones and not fragments covering the 5' ends. Therefore full-length cDNA cloning approaches are not suitable for high-throughput identification and analysis of start sites of transcription and the related promoter regions. The invention offers here a novel way to combine contrasting teachings and to obtain, by a high-throughput approach, 5' ends which are useful for promoter mapping and analysis. The use of the invention to study and analyze complex regulatory networks, in combination with the ability to identify and clone new genes, opens a wide area of applications for the invention to monitor biological systems and their status in development, homeostasis, and disease.
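
As a rough illustration of how 5' end tags could be used to map transcriptional start sites, the Python sketch below looks up each tag in a genomic sequence by exact forward-strand matching and tallies the start positions. It is a simplified sketch under stated assumptions: the genome string, the tags, and both function names are hypothetical, and a real analysis would also need reverse-strand matching and tolerance for sequencing errors.

```python
from collections import Counter


def map_tags_to_start_sites(genome, tags):
    """Count exact forward-strand matches; keys are 0-based start positions."""
    hits = Counter()
    for tag in tags:
        pos = genome.find(tag)
        while pos != -1:
            hits[pos] += 1
            pos = genome.find(tag, pos + 1)
    return hits


def upstream_region(genome, start, length):
    """Return up to `length` bases immediately upstream of `start`."""
    return genome[max(0, start - length):start]


# Made-up 30-nt "genome" and 10-nt tags, for illustration only.
genome = "TTGACATATAATGCACGTTAGCCATGGCTA"
tags = ["GCACGTTAGC", "GCACGTTAGC", "CACGTTAGCC"]

start_sites = map_tags_to_start_sites(genome, tags)
for position, count in sorted(start_sites.items()):
    print(position, count, upstream_region(genome, position, 6))
```

Positions supported by many tags are candidate start sites, and the sequence returned by `upstream_region` is where the related promoter elements would be sought.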
[0009]

After devoted research, the inventors involved in this application were able to complete the present invention by finding that, by selectively collecting multiple nucleic acid fragments containing information on the base sequences at the 5' end site of the mRNAs, it is not only possible to acquire information on the base sequences in mRNAs, but it is also possible to clone new genes; they were also able to arrive at a concrete method for attaining this goal.
[0010]

That is, the present invention provides a method for preparing concatemers of a plurality of nucleic acid fragments having information on nucleotide sequences of 5'-end regions of a plurality of mRNAs in a sample, comprising a first step of selectively collecting a plurality of first-strand cDNAs containing regions complementary to 5'-end regions of mRNAs, which cDNAs are formed by using mRNAs in the sample as templates; a second step of selectively collecting fragments containing cDNA regions including at least the regions complementary to the 5'-end regions of said mRNAs; and a third step of ligating the collected fragments to form a concatemer. The present invention also provides a method for determining nucleotide sequences of 5'-end regions of a plurality of mRNAs by sequencing said concatemer prepared by the method according to the present invention. The present invention further provides a method, which is the same method according to any one of claim 1 to 10, except that preliminarily obtained cDNAs having complete length are used instead of carrying out said first step. The present invention still further provides the concatemer prepared by the method according to the present invention. The present invention still further provides a vector comprising said concatemer according to the present invention. The present invention still further provides sequence tags derived from said concatemers prepared according to the present invention. The present invention still further provides means to use the sequences derived from said concatemers to analyze the content of the plurality of RNAs in a sample. The present invention still further provides means to use the sequences derived from said concatemers to identify regions in the genome which are required for gene regulation and gene expression.
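
The relationship between individual tags and a sequenced concatemer can be sketched in Python as follows. This is an in silico illustration only, assuming fixed-length tags separated by a constant spacer sequence left over from ligation. The spacer `GGATCC`, the 20-nt tag length, and the function names are hypothetical choices and are not specified by the text.

```python
def build_concatemer(tags, spacer):
    """Join tags head to tail, with the spacer before, between and after them."""
    return spacer + spacer.join(tags) + spacer


def split_concatemer(concatemer, spacer, tag_length):
    """Recover fixed-length tags from a concatemer read.

    Raises ValueError if the expected spacer is not found at a tag boundary.
    """
    tags = []
    step = len(spacer) + tag_length
    pos = 0
    while pos + step + len(spacer) <= len(concatemer):
        if concatemer[pos:pos + len(spacer)] != spacer:
            raise ValueError(f"spacer not found at position {pos}")
        tags.append(concatemer[pos + len(spacer):pos + step])
        pos += step
    return tags


# Made-up 20-nt tags and a hypothetical spacer, for illustration only.
spacer = "GGATCC"
original_tags = [
    "ATGGCGTACGTTAGCCATGG",
    "TTGACCATGACCGTAAGCTA",
    "CCGTTAGACTGACTTACGGA",
]
read = build_concatemer(original_tags, spacer)
recovered_tags = split_concatemer(read, spacer, tag_length=20)
print(recovered_tags == original_tags)
```

Stepping through the read by a fixed length, rather than splitting on the spacer, avoids mis-splitting when a tag happens to contain the spacer sequence itself.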
[0011]

The invention is not limited to the use of concatemers for sequencing of 5' ends; modifications at particular steps of the enrichment of 5' ends and their cloning as disclosed here allow for the individual sequencing of specific 5' ends. Such embodiments of the invention would include a modification of the first and second step, where a linker would be used that is specifically bound to a solid matrix. The cDNA bound to the support would then be used to prepare the sequencing reactions.



[0012]

Thus the invention refers more generally to the concept of isolating portions of nucleic acids corresponding to the 5' end of transcribed genes and using them for further high-throughput analysis such as sequencing.

[0013]

As described above, the method of the present invention can comprise, but is not limited to, roughly three steps, each of which further comprises a plurality of steps. Each step is explained below. Concrete examples of each step are described in detail in the working examples given later.

[0014]

Step 1

Step 1 is a step to selectively collect nucleic acids, namely cDNAs, containing a site corresponding to the 5' end site of mRNAs within a sample, which are synthesized, for instance, by using said mRNAs as templates.

[0015]

Either total RNA or mRNA taken from a desired cell or tissue can be used as the starting substrate. The preparation method of total RNA and mRNA is already known, and it is also described in detail in the working examples given later. In other embodiments, a full-length cDNA library may be used to isolate the 5' end nucleic acids corresponding to the 5' end of the transcribed part of the genes. Alternatively, a cDNA library itself could be cleaved if it carries a Class IIs enzyme site in proximity of the 5' end.



[0016]

Step 1 itself can be conducted by a publicly known method. In other words, methods to construct full-length cDNAs and methods to synthesize cDNA fragments at least containing a site corresponding to the 5' end site of the mRNAs are already known, and any of these methods can be adopted. One of the preferable methods is the cap trapper method (e.g. Piero CARNINCI et al., METHODS IN ENZYMOLOGY, VOL. 303, pp. 19-44, 1999). This cap trapper method is explained below; however, the invention is not limited to the use of the cap trapper method, and other approaches to enrich or select full-length cDNAs could be applied as well. An alternative method (as described by Pelletier et al. in 1995) makes use of an immobilized cap-binding protein to isolate full-length cDNAs after RNase treatment of a hybrid.

[0017]

As an alternative to cap selection, one could dephosphorylate the 5' ends of mRNAs with a phosphatase, such as BAP (bacterial alkaline phosphatase), followed by treatment with the decapping enzyme TAP (tobacco acid pyrophosphatase). Subsequently a ribonucleotide or a deoxyribonucleotide can be attached to the 5' end of the mRNA in place of the original cap structure with RNA ligase (Maruyama and Sugano). In this way, for instance, a Class IIs recognition site could be placed on the oligonucleotide or ribonucleotide sequence used during the ligation step, which is placed at the 5' end of a cDNA or RNA. This Class IIs restriction enzyme can then cleave the cDNA and produce the 5' end tag.

[0018]

As an alternative to biotin, a cap-binding protein (Pelletier et al. Mol Cell Biol 1995 15:3363-71) or an antibody that specifically binds to the cap structure can be used as the aforementioned selectively binding substance.



[0019]

Alternatively, one could use methods to attach oligonucleotides chemically to the cap structure as described by Genset. This method is based on the oxidation of the cap (US patent 6,022,715). This allows (1) adding to the cap an oligonucleotide, which may contain the Class IIs enzyme site, and (2) performing first-strand cDNA synthesis, then switching to second-strand cDNA synthesis; after the second-strand synthesis, the cDNA would be cleaved with Class IIs enzymes to make a 5' tag for subsequent formation of the concatemer.

[0020]

Alternatively, one could use the cap-switch method as described by Clontech (US patent 5,962,272). One could prepare the first-strand cDNA in the presence of a cap-switch oligonucleotide, which carries a recognition site for a substance capable of recognizing nucleic acids and cleaving them apart from the said recognition sequence, such as a Class IIs restriction enzyme site. The cap-switch mechanism lets the first-strand synthesis continue on the cap-switch oligonucleotide. This can be followed by second-strand cDNA synthesis, possibly also followed by PCR (as described for instance in the SMART™ Clontech cloning system), and finally the product would be cleaved with the Class IIs enzyme to produce the 5' end tags.

[0021]

In another embodiment, when the quality of the RNA allows it, one can prepare the cDNA by priming and extending along the RNA up to the cap structure. Particular enzymes and reaction conditions sometimes allow reaching the cap site very efficiently (Carninci et al., Biotechniques, 2002). Even without a cap selection it is possible to attach oligonucleotides in place of the cap structure, which carry Class IIs restriction enzyme sites that would later be used to produce concatemers.
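
The way an enzyme that cleaves outside its recognition site turns a linker-tagged cDNA into a 5' end tag can be sketched in silico. The Python example below is a simplified, top-strand-only model: it locates the recognition site and returns the bases between the end of the site and a cut point a fixed distance downstream. The function name, the linker, and the cDNA sequence are hypothetical; the site `TCCGAC` with a 20-nt reach is used as an MmeI-like example and should be checked against the enzyme supplier's data before relying on it.

```python
def excise_tag(top_strand, recognition_site, cut_offset):
    """Return the bases between the recognition site and the downstream cut.

    `cut_offset` is the number of bases between the end of the recognition
    site and the cut position on the top strand. Returns None if the site is
    absent or the molecule is too short to be cut at that distance.
    """
    site_start = top_strand.find(recognition_site)
    if site_start == -1:
        return None
    site_end = site_start + len(recognition_site)
    cut = site_end + cut_offset
    if cut > len(top_strand):
        return None
    return top_strand[site_end:cut]


# Hypothetical linker ending in the recognition site, joined directly to a
# made-up cDNA 5' end sequence (illustration only).
linker = "AAGG" + "TCCGAC"
cdna_5prime = "ATGGCGTACGTTAGCCATGGCTAACCGGTT"
construct = linker + cdna_5prime

tag = excise_tag(construct, "TCCGAC", 20)
print(tag)
```

If the linker carries extra bases between the recognition site and the cDNA junction, the portion of the excised fragment that derives from the transcript is shorter by that number of bases.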
[0022]

The cap trapper method first synthesizes the first-strand cDNA with a reverse transcriptase by using RNA as a template. This can be conducted by a known method. The cDNA can be primed with an oligo-dT primer or, when the template RNA is mRNA, it can be primed with a random primer. It is advisable to add trehalose to the reaction solution because it raises the efficiency of the reverse transcription reaction by stabilizing the reverse transcriptase. It is preferable to use 5-methyl-dCTP instead of standard dCTP, because it protects the cDNA from internal cleavage by several restriction enzymes and thus prevents unintended cleavage to a considerable extent. In addition, after the first-strand cDNA synthesis, proteins and digested peptides might be removed by CTAB (cetyl trimethyl ammonium bromide) treatment, or by other more general methods to purify cDNA.

[0023]

Next, a selectively binding substance is bound to the cap structure of mRNA. A "selectively binding substance" here means a substance that selectively binds to a specific substance, preferably but not limited to biotin. The cap structure is the structure at the 5' end of mRNA, which does not exist in transfer RNA (tRNA) or ribosomal RNA (rRNA). Therefore, even if total RNA was used as the starting substrate, the selectively binding substance only binds to mRNA. In addition, the selectively binding substance does not bind to mRNA if the cap structure at the 5' end has been cleaved. Biotin can be bound to the cap structure by a known method. For instance, the cap structure can be biotinylated by first oxidizing the diol groups on the cap structure by treating mRNA with an oxidizer such as NaIO4 and then reacting them with biotin hydrazide. Alternatively, any other method known to a person skilled in the art of preparing full-length cDNAs can be utilized to selectively enrich 5' ends according to the invention.



[0024]

Then, single-stranded RNA is cleaved by means such as RNase I treatment. Any other RNase that can cleave single-stranded RNA but not cDNA/RNA hybrids, or cocktails of RNases that cleave the various single-stranded RNA sequences with various specificities, can be used alternatively. In an RNA/cDNA hybrid whose first-strand cDNA has not extended to the site corresponding to the 5' end site of the RNA, the vicinity of the 5' end of the RNA is single-stranded because it is not hybridized with cDNA. Thus, the hybrid is cleaved at the single-stranded part and loses its cap structure through this step. Consequently, after this step only those mRNA/cDNA hybrids whose cDNA fully extends to the 5' end of the mRNA retain the cap structure.
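
The selection logic of this step can be expressed as a simple filter. The Python sketch below is a toy model only: each hybrid is described by the number of unpaired mRNA nucleotides between the cap and the point reached by the cDNA, and a hybrid keeps its cap only when that gap is zero. The class, field, and function names are hypothetical, and real RNase digestion is of course not all-or-nothing.

```python
from dataclasses import dataclass


@dataclass
class Hybrid:
    name: str
    five_prime_gap: int  # unpaired mRNA nucleotides between cap and cDNA end


def cap_retained_after_rnase(hybrid):
    """A single-stranded stretch next to the cap is cleaved, removing the cap."""
    return hybrid.five_prime_gap == 0


def select_full_length(hybrids):
    """Keep only hybrids that would still carry the (biotinylated) cap."""
    return [h for h in hybrids if cap_retained_after_rnase(h)]


# Made-up hybrids for illustration only.
hybrids = [
    Hybrid("full_length_1", 0),
    Hybrid("truncated_1", 35),
    Hybrid("full_length_2", 0),
    Hybrid("truncated_2", 1),
]
captured = select_full_length(hybrids)
print([h.name for h in captured])
```

Only the hybrids that pass this filter would subsequently be retained on the support through the selectively binding substance on the cap.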



[0025]

A matching selectively binding substance fixed to a support, which selectively binds to the aforementioned selectively binding substance, is prepared. In the present specification, a "matching selectively binding substance" means a substance that selectively binds to the aforementioned selectively binding substance; in the case where the selectively binding substance is biotin, it would be avidin, streptavidin or a derivative thereof that binds specifically to biotin or its derivatives. The support is preferably, but not necessarily, magnetic beads, particularly magnetic porous glass beads. Since magnetic porous glass beads to which streptavidin has been fixed are commercially available, such commercial streptavidin-fixed magnetic porous glass beads can be used favorably. Similarly, other materials such as latex beads, latex magnetic beads, agarose beads, polystyrene beads, sepharose beads or the like could be used instead of porous glass beads. Furthermore, the invention is not limited to the use of the biotin-avidin system; other binding substances could be used, such as a digoxigenin tag attached to the cap structure and digoxigenin-recognizing antibodies attached to a solid matrix.



[0026]

Following this, the aforementioned mRNA/cDNA hybrid with the cap structure is made to react with the aforementioned matching selectively binding substance fixed to the support in order to bind the selectively binding substance on the cap structure with the matching selectively binding substance on the support, thereby immobilizing the mRNA/cDNA hybrid with the cap structure on the support. When magnetic beads are used as the support, the magnetic beads can be quickly collected by applying a magnetic force. As mentioned above, the mRNA/cDNA hybrids that have the cap structure at this stage are only those with cDNA that fully extends to the 5' end of mRNA, so cDNAs containing a site complementary to the 5' end of mRNAs are selectively collected by this step, and Step 1 is completed. Meanwhile, in order to prevent non-specific binding to the support, it is preferable to treat the support with a large excess of DNA-free tRNA for blocking such binding before conducting this reaction. Other substances that are suitable for blocking the surface are nucleic acids or derivatives, for instance total RNA or oligonucleotides; proteins, for instance bovine serum albumin; and polysaccharides, for instance glycogen, dextran sulphate, heparin or other polysaccharides. Alternatively, hybrid molecules containing parts of all of the above could be used to mask non-specific binding sites.
[0027]

The above focuses on the case where Step 1 is conducted by the cap trapper method, but various other methods can also be used, as indicated, as long as they can selectively collect cDNAs containing a site complementary to the 5' end site of mRNA.
[0 0 2 8] 

The following Step 2 selectively collects fragments containing a cDNA
site that at least contains a site complementary to the 5' end site of
mRNA.

[0 0 2 9] 

First, the first-strand cDNA that has been immobilized on the support is
released. This can be done by treating the support with alkali, such as
NaOH. As an alternative to alkali, an enzymatic reaction with RNase H
(which cleaves only RNA hybridized to DNA) could be used. The alkali
treatment releases the cDNA from the mRNA/cDNA hybrid, which is bound to
the support through the cap on the mRNA, and separates the cDNA from the
mRNA, leaving only the first-strand cDNA.



Then, a linker is attached that carries a sequence that can be
recognized in a sequence-specific manner by a substance having an
enzymatic activity that cleaves the recognized DNA outside the
recognition sequence. An example of such a substance is a Class IIS
restriction enzyme.



In this embodiment, a linker that carries at least a Class IIS
restriction enzyme site and a random oligomer part at its 3' end is
ligated to the end of this first-strand cDNA that corresponds to the 5'
end of the aforementioned mRNA (i.e. the 3' end of the cDNA). For the
later cloning of the 5' end sequence tags into concatamers, it is
preferable but not essential to introduce a second recognition site into
the linker, which should be distinct from the aforementioned recognition
site used for, e.g., the Class IIS restriction enzyme.



[0 0 3 0] 
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[0 0 3 2] 

This can preferably be conducted as follows, by a method using a linker
that carries a Class IIS restriction enzyme site and a random oligomer
part (SSLLM (single strand linker ligation method), Y. Shibata et al.,
BioTechniques, Vol. 30, No. 6, pp. 1250-1254, (2001)). The Class IIS
restriction enzymes are a group of restriction enzymes that cleave at
positions outside the recognition site. An example, to which the
invention is not limited, is GsuI. GsuI cleaves one of the strands 16 bp
downstream from the recognition site, and the other strand 14 bp
downstream from the recognition site. Another suitable example is MmeI,
which cleaves 20 and 18 bases, respectively, from its recognition
sequence. The random oligomer part is located at the 3' end site of the
linker; although the number of bases is not particularly restricted, the
recommended number is 5 to 9, or more preferably 5 to 6. The Class IIS
restriction enzyme site should be located close to the aforementioned
random oligomer part, so that the cleavage point comes within the cDNA,
particularly relatively within the 5' side of the cDNA (i.e. the 3' side
of the template mRNA). The linker should preferably be a double-stranded
DNA linker in which the aforementioned random oligomer part protrudes at
the 3' side and provides the binding end. In addition, it is advisable
to bind a selectively binding substance such as biotin to the linker in
advance to facilitate its collection later.
[0 0 3 3] 

When the aforementioned first-strand cDNA is reacted with such a linker,
the random oligomer part of the linker hybridizes with the 3' end site
of the first-strand cDNA (i.e. the 5' end site of the template mRNA).
Next, the second-strand cDNA is synthesized by using this linker as a
primer and the first-strand cDNA as a template. This step can be
conducted by a standard method.
[0 0 3 4] 



Then, the obtained double-strand cDNA is treated with the above Class
IIS restriction enzyme. This step produces a double-strand cDNA fragment
comprising a linker-derived part and a part derived from the 5' end site
of the cDNA (the 5' end site of the second-strand cDNA). For instance,
if GsuI is used as the Class IIS restriction enzyme and the linker is
designed to locate the restriction site immediately upstream from the
aforementioned random oligomer site, the obtained DNA fragment includes
a 16 bp site derived from the 5' end side of the second-strand DNA (i.e.
the 5' end side of the mRNA); the complementary strand, however, is
14 bp. If MmeI is used, the lengths of the second-strand DNA fragment
increase to 20 and 18 bp, respectively.
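The relationship between the cleavage offsets and the resulting tag lengths can be illustrated with a short Python sketch. It assumes that the recognition site sits immediately upstream of the cDNA-derived bases, as in the case described above; the function name, the placeholder linker and the toy sequence are hypothetical and serve only as an illustration.

```python
# Cleavage offsets in bases downstream of the recognition site, as stated in
# the text: (strand carrying the tag, complementary strand).
CUT_OFFSETS = {"GsuI": (16, 14), "MmeI": (20, 18)}


def tag_after_site(second_strand, site_end, enzyme):
    """Return (tag, paired_part) for a Class IIS cut.

    second_strand: 5'->3' sequence of the linker followed by cDNA-derived bases.
    site_end: index of the first base after the recognition site.
    """
    long_cut, short_cut = CUT_OFFSETS[enzyme]
    tag = second_strand[site_end:site_end + long_cut]
    if len(tag) < long_cut:
        raise ValueError("sequence too short for this enzyme")
    # Part of the tag that stays base-paired with the shorter strand.
    return tag, tag[:short_cut]


# Hypothetical input: a placeholder linker followed by a toy 5'-end sequence.
linker = "T" * 10
cdna = "ACGTACGTACGTACGTACGTAAAA"
gsu_tag, gsu_paired = tag_after_site(linker + cdna, len(linker), "GsuI")
mme_tag, mme_paired = tag_after_site(linker + cdna, len(linker), "MmeI")
print(len(gsu_tag), len(gsu_paired))  # 16 14
print(len(mme_tag), len(mme_paired))  # 20 18
```

The two-base difference between the strands corresponds to the single-stranded overhang left by the staggered cut.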
[0 0 3 5] 

Next, such DNA fragments are selectively collected. If a selectively
binding substance (e.g. biotin) has been bound to the linker as above,
the collection can be conducted similarly to Step 1 by using a support
to which a matching selectively binding substance (e.g. streptavidin) is
fixed. This procedure completes Step 2, which selectively collects
fragments containing a cDNA site, belonging to the first-strand cDNA,
which at least contains a site complementary to the 5' end site of the
aforementioned mRNA.
[0 0 3 6] 

The above explains the case where the SSLLM is used for Step 2, but
Step 2 can also be carried out by any other method, as long as the
method can selectively collect fragments containing the 3' end site of
the first-strand cDNA (the 5' end site of the template mRNA). For
instance, it is possible to use an exonuclease that cleaves nucleotides
in the 5'-3' direction at a controlled speed. Exonuclease treatment of
the first-strand cDNA for a prescribed time period leaves a
single-strand fragment comprising the 3' end site of the first-strand
cDNA (the 5' end site of the template mRNA). Only the targeted
single-strand fragments can be obtained by conducting treatment with a
nuclease that cleaves only double-strand fragments. These fragments can
be collected, joined with adapters and cloned.



The subsequent Step 3 forms concatamers by mutually ligating the
collected fragments. Since there are multiple mRNAs and the linker
hybridizes with the first-strand cDNA at the random oligomer part as
above, the above method can obtain fragments containing multiple cDNAs
derived from multiple mRNAs within a sample. Step 3 ligates these
multiple fragments and forms concatamers. The ligation of the cDNA
fragments can be carried out by a standard method, using commercial
ligation kits. The ligation can be reliably carried out by, but is not
limited to, a method that first introduces a second linker providing a
recognition site for a restriction enzyme that is distinct from the
other recognition sites used at the earlier stages, then ligates two
fragments into di-tags, and further ligates such di-tag fragments into
concatamers. The number of ligated fragments is not restricted; it can
be practically any number above two and is preferably about 30. The
obtained concatamers are preferably, but need not be, amplified or
cloned by a standard method.
[0 0 3 8] 

The concatamers obtained in this way each comprise sites having the same
base sequence (except that uracil in RNA is thymine in DNA) as the 5'
end sites of the multiple mRNAs within the sample. Although each
concatamer also comprises parts derived from the linker or linkers, the
base sequence of the linker or linkers is already known, so the parts
derived from the linker or linkers and the parts derived from mRNA can
be clearly distinguished by examining the base sequence of the
concatamer. Therefore, by determining the base sequence of the obtained
concatamer, it is possible to find the base sequences at the 5' end
sites of multiple mRNAs within the sample. In the preferred mode using
GsuI or MmeI, the base sequences of a maximum of 16 or 20 bases,
respectively, at the 5' end site of each mRNA can be learned.
Information on 16 or 20 bases is statistically sufficient to identify
the mRNA almost definitively and to judge whether or not it is a new
mRNA. In addition, determining the base sequence of one concatamer
yields the base sequences at the 5' end sites of as many mRNAs as there
are fragments in the concatamer (preferably 20 to 30), so information on
the 5' end sites of multiple mRNAs can be determined efficiently. The
analysis of the concatamers can be automated by using computer software
to distinguish between sequences derived from the 5' ends and sequences
derived from the linker or linkers.
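As a rough illustration of such software, the following Python sketch splits a concatamer read at a known linker-derived sequence and counts the recovered tags. It is a minimal sketch under simplifying assumptions: the spacer sequence, the tags and the function names are hypothetical, the read is taken to be error-free, and tag orientation within di-tags is not handled. The helper `possible_tags` only shows how many distinct sequences a 16 or 20 base tag can take.

```python
from collections import Counter


def split_concatamer(read, linker):
    """Split a concatamer read on a known linker sequence, dropping empty parts."""
    return [part for part in read.split(linker) if part]


def possible_tags(length):
    """Number of distinct base sequences of the given length (four bases)."""
    return 4 ** length


# Hypothetical data: "AGATCT" stands in for the known linker-derived spacer.
linker = "AGATCT"
tag_a = "ACGTACGTACGTACGT"
tag_b = "GGGGCCCCAAAATTTT"
read = linker.join([tag_a, tag_b, tag_a])

tags = split_concatamer(read, linker)
counts = Counter(tags)
print(counts[tag_a], counts[tag_b])  # 2 1
print(possible_tags(16))  # 4294967296
print(possible_tags(20))  # 1099511627776
```

Counting how often each tag occurs across many concatamers is also the basis for estimating relative abundance of the corresponding transcripts.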
[0 0 3 9] 

When a base sequence at the 5' end reveals a new mRNA, the cDNA derived
from the new mRNA can be obtained by conducting RT-PCR, using that site
as the forward primer and oligo-dT as the reverse primer. It is also
possible to amplify the mRNA by methods such as NASBA. Accordingly, the
method of the present invention can be used for the cloning of new
genes. Similarly, forward primers derived from 5'-end specific
information can be used to amplify partial or full-length cDNA fragments
from existing cDNA libraries.

[0 0 4 0] 

While the above method uses mRNA or total RNA within the sample as the
starting substrate, Step 1 can be omitted by using an existing
full-length cDNA library. In this way, information on the base sequences
of the 5' end site of multiple cDNAs (i.e. the 5' end site of the mRNAs
used as templates for said cDNAs) contained in the full-length cDNA
library can be efficiently obtained similarly to the above procedure.



In some embodiments it could be desirable to obtain extended sequence
information from the 5' ends of transcribed regions. Such extended
sequences may in specific cases allow the identification of start sites
of protein synthesis or a better mapping to genomic sequences. As
described above, the invention includes in Step 2 the ligation of a
linker to the 5' end of a cDNA. Such a linker can be modified by
introducing a single-stranded overhang encompassing a sequence obtained
from a concatamer, so that it binds to and is ligated to a specific
nucleic acid fragment. After the ligation, the linker can be used to
enrich the DNA fragment by attaching the linker to a support, from which
it can be released after the enrichment. The linker can further be used
as a primer to obtain extended sequence information on 5' ends.



By investigating the base sequences of the concatamers or extended 5'
sequences obtained by the present invention, it is not only possible to
clone new genes as described above, but also possible to investigate the
expression profiles of genes within the sample. Furthermore, the
technology can be used for various purposes, such as to map
transcription start sites in the genome, to map promoter usage patterns,
for the analysis of SNPs in promoter regions, for creating gene networks
by combining the expression analysis with information on promoters,
alternative promoter usage and other data, and for selective collection
of the promoter site within fragmented genomic DNA. To select genomic
fragments containing promoter sites, a fragment containing the same base
sequence as the 5' end site of mRNA could be bound to a support, e.g. by
using the aforementioned biotin system, and hybridized to fragmented
genomic DNA. Hybridized genomic DNA fragments could then be separated
from a mixture of genomic fragments by using e.g. streptavidin-fixed
magnetic beads, and cloned under standard conditions.



Alternatively, one could avoid making concatamers and use selected 5'
end tags by ligating a mixture of full-length cDNAs to magnetic beads
carrying a homogeneous sequence of oligonucleotides, followed by
ligation such as in the SSLLM, second strand cDNA preparation and
cleavage with a Class IIS restriction enzyme. The 5' end specific tag
would be anchored specifically to the beads and would be used for
specific sequencing as done by Lynx Therapeutics (US patents 6,352,828;
6,306,597; 6,280,935; 6,265,163; 5,695,934).
[0 0 4 4] 

For instance, the oligonucleotides would have a "random part", which
will bind to the 5' ends of cDNAs, and a code part, which will be able
to "tag" the ligation product. The oligonucleotide may be destroyed by
exonuclease VII if it is not hybridized with a cDNA. The "decoder"
oligonucleotides would be used to select out the sequence. The specific
cDNAs on beads are then arrayed onto a solid surface, one per position,
followed by parallel sequencing. With one hole per bead, arrays of beads
carrying specific oligonucleotides can be made.



Through modifications such as the aforementioned approaches for direct
sequencing of 5' ends, the invention provides different means for the
general analysis of 5' ends in the form of concatamers or for the
analysis of individual 5' ends that were enriched by means of a 5' end
specific selection.

The present invention will now be described by way of examples thereof.
It should be noted that the present invention is not restricted to the
Examples. The experiments described in the Examples can be performed by
any person experienced in the standard techniques of the field of
Molecular Biology. Unless otherwise defined in the text, the technical
terms, abbreviations, and solutions used in the Examples have the same
meaning as commonly understood by a person skilled in the art in the
field of the invention. A general description of such terms,
abbreviations and solutions can be found in the common reagent section
in Molecular Cloning (Sambrook and Russell, 2001). All publications
mentioned herein are incorporated into this document by reference to
disclose and describe the methods and/or materials therein.

[0 0 4 7] 

Example 1: 

Preparation of total RNA from tissue 

In the literature a variety of different approaches for the preparation
of RNA have been described, which are known to a person skilled in the
art. All such approaches should allow the preparation of a plurality of
RNA samples derived from biological materials, including tissues and
cells, which are suitable for the invention. Below two such procedures
are described in detail.
Buffers and solutions:

a) Solution D: 4 M guanidinium thiocyanate, 25 mM sodium citrate
(pH 7.0), 100 mM 2-mercaptoethanol and 0.5% N-lauryl sarcosine.

b) RNase-free CTAB/urea solution: 1% CTAB (Sigma), 4 M urea,
50 mM Tris-HCl (pH 7.0), 1 mM EDTA (pH 8.0).

c) Water-equilibrated phenol as described in Molecular Cloning
(Sambrook and Russell, 2001).

d) Phosphate-buffered saline (PBS) as described in Molecular Cloning
(Sambrook and Russell, 2001).

e) 5 M sodium chloride.

f) 7 M guanidinium chloride.

g) RNase-free dd-water.

[0 0 4 8] 
Protocol for total RNA preparation

Dissect the tissue as fast as possible in a cooled dish.

Roughly evaluate the volume of tissue in a 50 ml Falcon tube. The best
quantity is 0.5-1 g of tissue per 20 ml of Solution D.

Add 2 ml of 2 M sodium acetate (pH 4.0) and 16 ml of water-equilibrated
phenol.

Mix by vortexing. Add 4 ml of chloroform and shake vigorously by hand
and on a vortex. Leave on ice for 15 min.

Centrifuge at 6,000 rpm for 30 min at 4 °C.

Transfer the upper aqueous phase to a new tube by pipetting (25 ml) and
recover approximately 20 ml thereof.

Precipitate the RNA from the aqueous phase by adding one volume of
isopropanol (in this case, approximately 20 ml); store on ice for 1 h.

Centrifuge at 7,500 rpm for 15 min at 4 °C; the RNA is pelleted by
centrifugation.

Wash the pellet twice with 70% ethanol, each time followed by
centrifugation at 7,500 rpm for 2 min, in order to remove the SCN salts.

CTAB removal of polysaccharides: selective CTAB precipitation of mRNA is
performed after complete re-suspension of the RNA in 4 ml of water.
Subsequently, 1.3 ml of 5 M NaCl is added and the RNA is then
selectively precipitated by adding 16 ml of the CTAB/urea solution.

Centrifuge for 15 min at 7,500 rpm (9,500 x g) and discard the aqueous
phase.

Re-suspend the RNA pellet in 4 ml of 7 M guanidinium chloride.

Precipitate the re-suspended RNA by adding 8 ml of ethanol. Incubate at
-20°C for 1-2 hours (or longer) and centrifuge for 15 min at 7,500 rpm,
4°C. At the end, wash the pellet with 5 ml of 70% ethanol.

Centrifuge again at 7,500 rpm for 5 min.

Discard the supernatant.

Re-suspend the RNA in 500-1000 microL of RNase-free dd-water.
[0 0 4 9] 

Preparation of an mRNA fraction from total RNA
The mRNA fraction of total RNA preparations can be isolated with
commercial kits such as the MACS mRNA isolation kit (Miltenyi) or
polyA-quick (Stratagene), which provide a satisfactory yield of mRNA
under the recommended conditions. One cycle of oligo-dT selection of the
mRNA is sufficient. It is advisable to redissolve the poly-A+ RNA at a
high concentration of 1 to 2 microG/microL.
[0 0 5 0] 

Preparation of a plurality of RNA samples from a cDNA library 

Alternatively, a plurality of nucleic acids corresponding to the 5' ends
of genes can be obtained from existing cDNA libraries that were cloned
into expression vectors. By standard methods known to a person familiar
with molecular biology approaches, RNA transcripts can be obtained from
such libraries by in vitro transcription reactions using e.g. a T3, T7
or SP6 RNA polymerase. Such an approach can be performed by first
linearizing the plasmid DNA with appropriate restriction endonucleases.
The restriction enzyme can be chosen to allow for the transcription of
the sense RNA. In the case of libraries obtained in the vector pFLC III
(Carninci P, Shibata Y, Hayatsu N, Itoh M, Shiraki T, Hirozane T,
Watahiki A, Shibata K, Konno H, Muramatsu M, Hayashizaki Y. Balanced-size
and long-size cloning of full-length, cap-trapped cDNAs into vectors of
the novel lambda-FLC family allows enhanced gene discovery rate and
functional analysis. Genomics, 2001 Sep; 77(1-2): 79-90), the vector can
be linearized by cleavage with one of the homing endonucleases I-Ceu I
or PI-Sce I to avoid truncation of the inserts. For the digest, mix in a
tube:

Plasmid DNA            100 microG
10x buffer             40 microL
Restriction enzyme     100 U
ddH2O                  to 400 microL

Incubate at the appropriate temperature for at least 2 h and analyze
1 microL of the reaction mixture by agarose gel electrophoresis. If the
digest is complete, add:

0.5 M EDTA                 8 microL
10% SDS                    8 microL
Proteinase K (10 mg/ml)    5 microL

Incubate for 15 min at 45°C before extracting the sample with 500 microL
of phenol/chloroform. The aqueous phase is to be re-extracted twice with
500 microL of chloroform. Finally, the linearized DNA is precipitated
with isopropanol or ethanol under standard conditions and dissolved in
50 microL of TE.

[0 0 5 1] 
In vitro RNA synthesis:
Mix in a tube under RNase-free conditions:

Linearized plasmid DNA       20 microG
5x T7 or T3 buffer           200 microL
0.1 M DTT                    100 microL
2 mg/ml BSA                  40 microL
10 mM rNTPs                  50 microL
T7 or T3 RNA polymerase      10 microL
ddH2O                        to 1000 microL

Incubate at 37°C for 3 to 4 h before adding:

10 mM calcium chloride       10 microL
1 U/microL DNase RQ1         5 microL

Incubate at 37°C for 20 min before adding:

0.5 M EDTA                   10 microL
10 mg/ml Proteinase K        5 microL

Incubate at 45°C for 30 min before adding sodium chloride to a final
concentration of 1 M. Phenol/chloroform extraction followed by
re-extraction with chloroform should be performed under standard
conditions, and the RNA transcripts can finally be collected by
isopropanol or ethanol precipitation. The pellet is to be resuspended in
200 microL of water or TE. The quality of the RNA transcripts should be
confirmed by agarose gel electrophoresis and quantification.



[0 0 5 2]

2. First strand cDNA synthesis

Buffers and solutions

Saturated trehalose, about 80% in water (crystals will remain), low
metal content
4.9 M high purity sorbitol
Optionally: Takara GC-Taq buffer

[0 0 5 3]
Enzymes and buffers

RNase H- reverse transcriptase SuperScript II (Invitrogen) and buffer,
or other reverse transcriptases.
[0 0 5 4]
Nucleic acids and oligonucleotides

Purified, first-strand oligo-dT primer (Sequence for primer used:
5' -aGAGAGAGAGGATCCTTOXJGAGAGTTTTm ' ). Alternatively or additionally,
random primer (dNg-dNg), where N is any nucleotide.

mRNA, recommended 2.5 to 25 microG, or alternatively total RNA,
5-50 microG

[0 0 5 5] 

Radioactive compounds 

[alpha-32P]dGTP

[0 0 5 6] 

Protocol A: Trehalose-Sorbitol enhanced 

To prepare the 1st strand cDNA, put together the following reagents in
three different 0.5 ml PCR tubes (A, B, and C).
[0 0 5 7]

Tube A: in a final volume of 21.3 microL, add the following:

mRNA                                   2.5-25 microG
or total RNA                           5-50 microG
1st strand primer (2 microG/microL)    14 microG (7 microL)
Total volume: 22 microL

Heat the mixture (mRNA, primer) at 65°C for 10 min to dissolve the
secondary structures of the mRNA.

Tube B: in a final volume of 76 microL, add the following:

5X 1st strand buffer                              28.6 microL
0.1 M DTT                                         11 microL
dATP, dTTP, dGTP, and 5-methyl-dCTP, 10 mM each   9.3 microL
4.9 M sorbitol                                    55.4 microL
Saturated trehalose                               23.2 microL
RNase H- SuperScript II reverse transcriptase
(200 U/microL)                                    15.0 microL
Final volume: 142.5 microL
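When a reaction mix of this kind is scaled or adapted, it can help to check that the individual volumes add up to the stated final volume. The following Python sketch does this for the Tube B list of Protocol A; the dictionary keys and the function name are illustrative only.

```python
# Volumes in microL, copied from the Tube B list of Protocol A above.
TUBE_B = {
    "5X 1st strand buffer": 28.6,
    "0.1 M DTT": 11.0,
    "dNTP mix (10 mM each)": 9.3,
    "4.9 M sorbitol": 55.4,
    "saturated trehalose": 23.2,
    "reverse transcriptase (200 U/microL)": 15.0,
}


def total_volume(reagents):
    """Sum reagent volumes, rounded to 0.1 microL to absorb float noise."""
    return round(sum(reagents.values()), 1)


print(total_volume(TUBE_B))  # 142.5
```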
[0 0 5 8] 

Prepare a cycle (on a thermal cycler) with: 40°C, 4 min; 50°C, 2 min;
56°C, 60 min.
If total RNA is used as the starting material, prepare a cycle with:
40°C, 2 min, -0.1°C/sec to 35°C; 50°C, 2 min; 56°C, 60 min.
Alternatively: prime the cDNA with a random primer (dN9, N = any
nucleotide) at 25°C.
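The two cycling programs can be written down as data, which makes it easy to check the programmed run time before starting. The Python sketch below is a minimal illustration, reading the temperatures above as 40°C, 35°C, 50°C and 56°C; the class and function names are hypothetical, and the time the instrument needs to move between holds (other than the explicit -0.1°C/sec ramp) is ignored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hold:
    temp_c: float
    minutes: float


@dataclass(frozen=True)
class Ramp:
    to_temp_c: float
    rate_c_per_s: float  # magnitude of the temperature change per second


def total_seconds(steps):
    """Programmed time of holds and explicit ramps; other transitions are ignored."""
    total = 0.0
    current = None
    for step in steps:
        if isinstance(step, Hold):
            total += step.minutes * 60
            current = step.temp_c
        else:
            if current is None:
                raise ValueError("a ramp needs a preceding hold")
            total += abs(step.to_temp_c - current) / step.rate_c_per_s
            current = step.to_temp_c
    return total


MRNA_PROGRAM = [Hold(40, 4), Hold(50, 2), Hold(56, 60)]
TOTAL_RNA_PROGRAM = [Hold(40, 2), Ramp(35, 0.1), Hold(50, 2), Hold(56, 60)]

print(round(total_seconds(MRNA_PROGRAM)))       # 3960
print(round(total_seconds(TOTAL_RNA_PROGRAM)))  # 3890
```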

[0 0 5 9] 

Tube C: 

1-1.5 microL of [alpha-32P]dGTP.

[0 0 6 0] 

For a cold start, operate as follows:

Quickly mix tubes A and B on ice.

Transfer 40 microL of the A+B mixture into tube C.

Immediately transfer tubes A+B and C to the thermal cycler at the 40°C
step (step 1 of the above cycling program) to anneal at 40°C for
4 minutes.
Let the reaction proceed following the thermal cycler setting.

[0 0 6 1]
For a hot start, operate as follows:
Transfer the tubes A, B, C to the thermal cycler.
Start the cycling.

When the temperature reaches 42°C, quickly mix tubes A and B.
Transfer 40 microL of the A+B mixture into tube C.
Let the reaction proceed following the thermal cycler setting.
[0 0 6 2] 

Protocol B: GCI-Trehalose-Sorbitol enhanced 

Tube A: in a final volume of 22 microL, add the following:

mRNA                                                 5-25 microG
(precipitate with ethanol and re-suspend directly with the primer)
or total RNA, up to 50 microG (for the small-scale protocol)
Purified 1st strand cDNA primer (2 microG/microL)    14 microG (7 microL)
Final volume: 22 microL

Tube B: add the following:

2X GC I (LA Taq) buffer (TaKaRa)                     75 microL
dATP, dTTP, dGTP, and 5-methyl-dCTP, 10 mM each      4 microL
4.9 M sorbitol                                       20 microL
Saturated trehalose (approximately 80%)              10 microL
SuperScript II reverse transcriptase (200 U/microL)  15 microL
ddH2O                                                4 microL
Final volume: 128 microL

Tube C:

[alpha-32P]dGTP                                      1.5 microL

For the rest of the procedure, follow exactly the same steps as under
the normal reaction conditions. Prepare (in advance) a thermal cycler
with the following cycle:

42°C, 30 min; 50°C, 10 min; 55°C, 10 min; 4°C, indefinite time.

[0 0 6 3] 
Operate as follows:

1) Transfer the tubes A, B, C to the thermal cycler.

2) Start the cycling.

3) When the temperature reaches 42°C, quickly mix tubes A and B.

4) Transfer 40 microL of the A+B mixture into tube C.

5) Let the reaction proceed following the thermal cycler setting.

At the end, stop the reaction with EDTA at 10 mM final concentration.

Then the incorporation of [alpha-32P]dGTP is measured and the yield of
cDNA is calculated. Calculating the amount of cDNA by measuring
[alpha-32P]dGTP incorporation is useful for monitoring whether the
processes are proceeding accurately.
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[0 0 6 4] 



3. CTAB precipitation of the first-strand cDNA

Buffers and solutions

CTAB solution as described in Example 1

After measuring the radioactivity, transfer both the "hot" and "cold"
1st strand synthesis reactions (tubes B and C) to one tube and perform
the CTAB precipitation as follows.

Mix tubes B and C from the first strand synthesis; to the mixture add:
3 microL of 0.5 M EDTA (final concentration of 10 mM)
2 microL of 10 microG/microL Proteinase K.

Incubate at 45°C or 50°C for at least 15 min, and for as long as 1 hour.

To the 128-142 microL volume of the first strand cDNA reaction, add:

32 microL of 5 M sodium chloride (RNase free)

320 microL of CTAB/urea solution

Incubate at room temperature for 10 min.

Centrifuge at 15,000 rpm for 10 min.

Remove the supernatant.

Carefully re-suspend with 100 microL of 7 M guanidinium chloride.
Add 250 microL of ethanol and leave on ice or at -20 to -80°C for
30-60 min.
Centrifuge at 15,000 rpm for 10 min. Remove the supernatant.
Subsequently, wash the pellet twice with 800 microL of 80% ethanol. Each
time, add 80% ethanol to the tube and centrifuge for 3 min at
15,000 rpm.

Re-suspend the cDNA in 46 microL of water.
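The final EDTA concentration quoted in this protocol follows from the usual dilution relation C1 x V1 = C2 x V2. The short Python sketch below checks it for the two reaction volumes given above (128 and 142.5 microL); the function name is hypothetical, and the small volume contributed by the Proteinase K is neglected.

```python
def final_concentration_mM(stock_mM, added_uL, sample_uL):
    """Concentration after adding a stock to a sample (C1*V1 = C2*V2)."""
    return stock_mM * added_uL / (sample_uL + added_uL)


# 3 microL of 0.5 M (500 mM) EDTA added to either reaction volume.
for sample_uL in (128.0, 142.5):
    print(sample_uL, round(final_concentration_mM(500.0, 3.0, sample_uL), 1))
```

Both results lie between 10 and 12 mM, in line with the stated final concentration of about 10 mM.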



[0 0 6 5] 



4. Cap-trapping, oxidation and biotinylation of the cap

Buffers and solutions

1 M sodium acetate buffer, pH 4.5
1 M citrate buffer, pH 6.0
NaIO4 solution, >100 mM
10% SDS

Biotinylation buffer: 33 mM sodium citrate, pH 6.0, and 0.33% SDS.

10 mM biotin hydrazide long arm (MW = 371.51; 3.71 mg/ml = 10 mM) in
citrate/SDS buffer.
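The stated equivalence between 10 mM and 3.71 mg/ml follows directly from the molecular weight. A minimal Python sketch of the conversion, with a hypothetical function name:

```python
def mg_per_ml(conc_mM, mol_weight):
    """Mass concentration (mg/ml) of a solution of given molarity (mM)."""
    # mM * g/mol = mg/L; dividing by 1000 gives mg/ml.
    return conc_mM * mol_weight / 1000.0


# Biotin hydrazide long arm, MW = 371.51, at 10 mM: roughly 3.7 mg/ml.
print(round(mg_per_ml(10.0, 371.51), 4))
```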
Cap biotinylation: (A) Oxidation of the diol groups of mRNA

In a final volume of 50 to 55 microL, add the following:

The re-suspended cDNA sample

3.3 microL of 1 M sodium acetate buffer, pH 4.5

A freshly prepared solution of NaIO4 to a final concentration of 10 mM
Incubate on ice in the dark for 45 min.
Finally, precipitate the cDNA:

To simplify the downstream process, add 1 microL of 80% glycerol.
Vortex.

Add 0.5 microL of 10% SDS, 11 microL of 5 M sodium chloride and
61 microL of isopropanol.

Incubate at -20 or -80°C for 30 min in the dark.

Centrifuge for 15 min at 15,000 rpm.

Remove the supernatant.

Add 500 microL of 80% ethanol.

Centrifuge at 15,000 rpm for 2-3 min.

Discard the supernatant.

Repeat the 80% ethanol wash (add ethanol, centrifuge, discard the
supernatant).

Re-suspend the cDNA in 50 microL of water.

Biotinylation: (B) Derivatization of the oxidized diol groups

To the cDNA (50 microL), add 160 microL of the biotin hydrazide long arm
dissolved in the reaction buffer. Perform the reaction in 210 microL
(final volume).
Incubate overnight (10-16 hours) at room temperature (22-26°C).
Subsequently, to precipitate the biotinylated cDNA, add:
75 microL of 1 M sodium citrate, pH 6.1
5 microL of 5 M sodium chloride
750 microL of absolute ethanol

Incubate on ice for 1 hour, or at -80 or -20°C for 30 min or longer.
Centrifuge the sample at 15,000 rpm for 10 min.
Wash the precipitate twice with 70% or 80% ethanol: centrifuge, discard
the supernatant and repeat the wash. Dissolve the cDNA in 175 microL of
TE (1 mM Tris, pH 7.5, 0.1 mM EDTA).

Cap-trapping and releasing the 5' ends of cDNA

Enzymes and buffers

RNase ONE (Promega) and its reaction buffer

To the cDNA sample add, in a final volume of 200 microL:
20 microL of RNase I buffer (Promega).

1 unit of RNase I (Promega, 5 or 10 U/microL) per 1 microG of starting
mRNA or total RNA (in the case of the small-scale protocol) used for
first strand cDNA synthesis.
Incubate at 37°C for 30 min.
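The amount of enzyme to pipette follows from the rule of 1 unit per microG of starting RNA and the concentration of the enzyme stock. A minimal Python sketch of this arithmetic, with a hypothetical function name:

```python
def rnase_volume_uL(rna_ug, stock_u_per_uL, units_per_ug=1.0):
    """Volume of RNase I stock needed for a given amount of starting RNA."""
    return rna_ug * units_per_ug / stock_u_per_uL


# 25 microG of starting mRNA with the two stock concentrations named above.
print(rnase_volume_uL(25.0, 10.0))  # 2.5
print(rnase_volume_uL(25.0, 5.0))   # 5.0
```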

To stop the reaction, put the sample on ice and add
4 microL of 10% SDS and
3 microL of 10 microG/microL Proteinase K.
Incubate at 45°C for 15 min.

Extract once with 1:1 Tris-equilibrated phenol:chloroform, then load the
aqueous phase into a Microcon-100.
Perform a back extraction with water and load again into the
Microcon-Centricon 100 filter.

Perform one round of Microcon separation.

8-b) Dissolve the pellet completely in 20 microL of 0.1 x TE.
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Magnetic beads blocking 
Materials

Streptavidin-coated MPG (CPG Inc., New Jersey)
Buffers and solutions

Binding buffer: 4.5 M NaCl, 50 mM EDTA, pH 8.0
Special equipment

A magnetic stand to hold 1.5 ml tubes is required.

To further minimize the non-specific binding of nucleic acids, the
magnetic beads are pre-incubated with DNA-free tRNA (10 mg/ml).

For each preparation, pre-incubate 500 microL of magnetic beads (per
25 microG of starting mRNA) with 100 microG of tRNA.

Incubate on ice for 30 min with occasional mixing.

Separate the beads with a magnetic stand (for 3 min) and remove the
supernatant.

Wash 3 times with 500 microL of binding buffer.
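For preparations that start from a different amount of mRNA, the bead and tRNA quantities can be scaled from the reference values above. The Python sketch below assumes simple linear scaling, which is an assumption of this illustration and not something the protocol states; the names are hypothetical.

```python
REFERENCE_MRNA_UG = 25.0     # starting mRNA the quantities below refer to
BEADS_UL = 500.0             # magnetic beads per reference preparation
TRNA_UG = 100.0              # blocking tRNA per reference preparation
TRNA_STOCK_UG_PER_UL = 10.0  # 10 mg/ml stock equals 10 microG/microL


def blocking_mix(mrna_ug):
    """Bead and tRNA amounts, scaled linearly with the starting mRNA (assumed)."""
    factor = mrna_ug / REFERENCE_MRNA_UG
    trna_ug = TRNA_UG * factor
    return {
        "beads_uL": BEADS_UL * factor,
        "tRNA_ug": trna_ug,
        "tRNA_stock_uL": trna_ug / TRNA_STOCK_UG_PER_UL,
    }


print(blocking_mix(25.0))  # 500 microL beads, 100 microG tRNA, 10 microL stock
print(blocking_mix(12.5))  # half of each
```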

[0 0 6 7] 
5'-end cDNA capture and release

To capture the full-length cDNA, mix the RNase I-treated cDNA and the
washed beads as follows:

1) Re-suspend the beads in 500 microL of wash/binding buffer.

2) Transfer 350 microL of the beads into the tube containing the
biotinylated first-strand cDNA.

3) After mixing, gently rotate the tube for 10 min at 50°C.

4) Transfer 150 microL of the beads into the tube containing the
biotinylated first-strand cDNA and 350 microL of beads.

5) After mixing, gently rotate the tube for 20 min at 50°C.

Separate the beads from the supernatant on a magnetic stand.

Washing the beads

Gently wash the beads with 0.5 ml of the indicated buffer to remove the
non-specifically adsorbed cDNAs:
2 x with washing/binding solution.
1 x with 0.3 M NaCl / 1 mM EDTA.
2 x with 0.4% SDS / 0.5 M NaOAc / 20 mM Tris-HCl pH 8.5 / 1 mM EDTA.
2 x with 0.5 M NaOAc / 10 mM Tris-HCl pH 8.5 / 1 mM EDTA.

Alkali release (see below)

Alkali full-length cDNA release from beads

Add 100 microL of 50 mM NaOH, 5 mM EDTA.

Briefly stir and incubate for 5 min at RT with occasional mixing.

Separate the magnetic beads and transfer the eluted cDNA onto ice.

Repeat the elution cycle with 100 microL of 50 mM NaOH, 5 mM EDTA, two
more times, until most of the cDNA (80-90%, as measured by monitoring
the radioactivity) has been recovered from the beads.

Adding a 5'-end primable site to the cDNA

RNase step

Enzymes and buffers

- RNase ONE™ and its buffer (Promega)

Add 50 microL of 1 M Tris-HCl, pH 7.0, to the tubes on ice and mix
quickly.
Add 1 microL of RNase I (10 U/microL) and mix quickly.

Incubate at 37 °C for 10 min.
To remove the RNase I, treat the cDNA with Proteinase K and
phenol/chloroform extraction, including back extraction.

Add 3 microG of glycogen. Treat the cDNA with one cycle of Microcon-100.

Fractionation of cDNA before adding a primable site 

Materials 

Amersham-Pharmacia S-400 spun kit or alternative kits 
Buffers and solutions 

Column buffer: 10 mM Tris, pH 8.0, 1 mM EDTA, 0.1 % SDS, and 100 mM NaCl 
Column buffer without SDS: 10 mM Tris, pH 8.0, 1 mM EDTA and 100 mM NaCl 

S-400 spun column chromatography

Detailed protocols are described in the kits. This is the running
protocol for S-400 spun columns.
Shake the column.

Break the seal and transfer the column to a 2 ml tube.
Centrifuge at 3,000 rpm for 1 min (+4 °C).
Add the cDNA (< 20 microL volume).
After the cDNA, add 80 microL of water.
Centrifuge for 2 min at 3,000 rpm.

Concentrate by Microcon 100 or precipitate with isopropanol. Recovery
should exceed 80%.

[0068]
6. SSLLM 
Materials 

S-300 spun column chromatography kit (Amersham-Pharmacia) 
Buffers and solutions 

Column buffer: 10 mM Tris-HCl pH 8.0, 1 mM EDTA, 0.1% SDS, 100 mM NaCl.

Enzymes and buffers 

Takara DNA Ligase KIT II. 

Nucleic acids and oligonucleotides 
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In the Example given here, the recognition sites for the restriction enz 
ymes Bgl II, Gsu I and Mme I are introduced, however, the invention is n 
ot dependent or limited to the use of those restriction enzymes and thei 
r recognition sites. In particular, Bgl II (recognition site: AGATCT) ca 
n be replaced by any endonuclease suitable for cloning. Other example fo 
r such enzyme could include Asc I (recognition site: GGCGCGCC) or Xba I 
(recognition site: TCTAGA). 

Synthesize the following oligonucleotides containing the Gsu I restriction site.

Oligonucleotide Bg-Gsu-GN5:

5'-Biotin-AGAGAGAGMCTAGGCTTMTAGGTGACTAGATCTG^ ' ;

Oligonucleotide Bg-Gsu-N6:

5'-Biotin-AGAGAGAGMCTAGGCTTMTAGGTGACTAGATCTGGAGNNN^ ;

Oligonucleotide Bg-Gsu-down:

5' P-CTGGAGATCTAGTCACCTATTM(X:CTAGTTCrCTCTCT-NH2 3' .

Synthesize the following oligonucleotides containing the Mme I restriction site.

Oligonucleotide Bg-Mme-GN5:

5'-Biotin-AGAGAGAGMCTAGGOTMTAGGTGACTAGATCT^ ;

Oligonucleotide Bg-Mme-N6:

5'-Biotin-AGAGAGAGMCTAGGCTTMTAOnGAC^^ ;

Oligonucleotide Bg-Mme-down:

5' P-GTYGGAGATCTAGTCACCTATTMGCCTAGTTCTCTCT^ 3' .
Where R stands for G or A and Y stands for C or T. 

P means that the oligonucleotide must be 5' phosphorylated, and NH2 indicates that an amino group is added to avoid non-specific ligation and possible hairpin priming.

Oligonucleotides should be purified by 10% acrylamide gel electrophoresis following standard techniques, as for the first-strand cDNA primer (Sambrook and Russell, 2001). Oligonucleotides should then be extracted with phenol/chloroform and chloroform and precipitated with 2 volumes of ethanol, as for the first-strand cDNA primer.

[0069]
Preparation of the linkers.

After checking the OD, mix the Bg-Gsu-GN5, Bg-Gsu-N6 and "down" oligonucleotides at a ratio of 4:1:5, at a DNA concentration of at least 2 microG/microL, and add NaCl to a final concentration of 100 mM. The oligonucleotides are annealed at 65°C for 5 min, 45°C for 5 min, 37°C for 10 min and 25°C for 10 min.
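
The 4:1:5 mixing ratio can be turned into per-oligonucleotide amounts with a few lines of arithmetic. The sketch below is illustrative only: the text does not say whether the ratio is by mass or by moles, so mass is assumed here, and the function name `linker_mix` is hypothetical.

```python
def linker_mix(total_ug, ratio=(4, 1, 5)):
    """Split a total linker amount (microG) across the three oligonucleotides.

    Assumption: the 4:1:5 ratio is applied by mass; the protocol text does
    not state whether mass or moles is meant.
    """
    names = ("Bg-Gsu-GN5", "Bg-Gsu-N6", "Bg-Gsu-down")
    parts = sum(ratio)
    return {name: total_ug * r / parts for name, r in zip(names, ratio)}


# Example: 20 microG of total linker mixture.
mix = linker_mix(20.0)
for name, amount in mix.items():
    print(f"{name}: {amount:.1f} microG")
```

With 20 microG in total, this gives 8 microG of Bg-Gsu-GN5, 2 microG of Bg-Gsu-N6 and 10 microG of the "down" oligonucleotide.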

[0070]
Ligation of the first-strand cDNA

Use 2 microG of linker mixture for up to 1 microG of single-strand cDNA. Mix linkers and cDNA (final volume: 5 microL).

Heat at 65°C for 5 min to melt secondary structures of the single-strand cDNA.
Transfer the linker and cDNA mix onto ice.

Add 5 microL of solution II from the TAKARA DNA ligation Kit.
Add 10 microL of solution I of the kit.
Incubate at 10°C overnight (more than 10 hours).

At the end of the ligation reaction, stop the reaction by adding 1 microL of 0.5 M EDTA, 1 microL of 10% SDS, 1 microL of 10 mg/ml Proteinase K and 10 microL of water, and incubate at 45°C for 15 min.

Treat with phenol/chloroform, chloroform and back extract (see appendix) 
with 60 microL of column buffer 

After the ligation, remove the excess linker with S-300 spin column chromatography.

1) Shake the column several times and then let it stand upright. 

2) Remove the upper cap, then the bottom one. 

3) Drain the buffer of the column. Apply 2 ml of the column buffer and d 
rain twice by gravity. 
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Put the column into a 15 ml centrifuge tube, then centrifuge at 400 x g 
for 2 min in a swing-out rotor at room temperature. 

Apply 100 microL of buffer to the column, then centrifuge at 400 x g for 
2 min. Check the eluted volume. If it is different from the input (100 
microL), repeat this step until the eluted volume is the same as the add 
ed one. 

Set a 1.5 ml tube, after cutting off the cap, into the 15 ml centrifuge 
tube, and then apply the sample into the column. Centrifuge at 400 x g f 
or 2 min. 

Collect the eluted fraction in a separate tube. Apply 50 microL of buffer to the column, repeat the centrifugation and collect the fraction in a separate tube.

Repeat step 6 for 3 to 5 more times; keep the eluted fractions separate. Collected fractions should be counted in a scintillation counter. Usually, the first 2-3 fractions (80% of the cpm of the cDNA) are mixed.

Add NaCl to a final concentration of 0.2 M and precipitate the cDNA by adding an equal volume of isopropanol.

After precipitation and washing twice with cold 80% ethanol, re-suspend in water.
Second-strand cDNA 

Set the 2nd strand cDNA program on the thermal cycler as follows:
Step 1 5 min at 65 °C
Step 2 30 min at 68 °C
Step 3 10 min at 72 °C
Step 4 +4 °C

Procedure for the second strand cDNA 
Second strand steps, mix in a test tube: 
The cDNA 
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6 microL of LA-Taq polymerase buffer (Takara) 
6 microL of 2.5 mM (each) dNTPs (Takara)

0.5 microL of [alpha-32P] dGTP (optional, to follow the incorporation)

After starting the 2nd strand program, put the tube on the thermal cycler.

Add 3 microL of 5 U/microL LA polymerase, or an alternative thermostable polymerase cocktail, to the tube when the samples are at 65°C during the first step.

Mix quickly but thoroughly.

At the end of the thermal cycler program, stop the reaction by adding EDTA to a final concentration of 10 mM and clean up the reaction by Proteinase K treatment, phenol-chloroform extraction and ethanol precipitation (see Sambrook and Russell, 2001, Molecular Cloning, CSHL press, NY).

[0071]

11. Cleavage of cDNA

The cDNA should then be cleaved with a Class IIs restriction enzyme, such as Gsu I in this Example.

Buffer (10X) (MBI Fermentas)               10 microL
Gsu I (1 U/microL) (use 5 U/microG DNA)    Y microL
ddH2O                                      X microL
Final volume                               100 microL

Where Y and X vary depending on the quantity of cDNA.

1) Incubate at 37°C for 1 hour.

2) Add 2 microL of 0.5 M EDTA.

3) Incubate at 65 °C for 15 min to inactivate the enzyme.
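
Because Y (enzyme) and X (water) depend on how much cDNA goes into the digest, the volumes can be worked out as below. This is a minimal sketch, not part of the original protocol: the function name is hypothetical, the enzyme stock concentration of 1 U/microL is an assumed reading of the reagent line, and the volume occupied by the cDNA itself is an input the text does not specify.

```python
def gsu_digest_volumes(cdna_ug, cdna_ul, units_per_ug=5.0, enzyme_u_per_ul=1.0,
                       buffer_ul=10.0, final_ul=100.0):
    """Return (Y, X): enzyme and water volumes in microL for a 100 microL digest.

    Assumptions: 5 U of enzyme per microG of DNA (from the protocol) and an
    enzyme stock of 1 U/microL (assumed; check the supplier's label).
    """
    enzyme_ul = cdna_ug * units_per_ug / enzyme_u_per_ul
    water_ul = final_ul - buffer_ul - enzyme_ul - cdna_ul
    if water_ul < 0:
        raise ValueError("components exceed the final reaction volume")
    return enzyme_ul, water_ul


# Example: 2 microG of cDNA supplied in 20 microL.
y, x = gsu_digest_volumes(cdna_ug=2.0, cdna_ul=20.0)
print(f"Y = {y:.1f} microL enzyme, X = {x:.1f} microL ddH2O")
```

For 2 microG of cDNA in 20 microL this gives Y = 10 microL and X = 60 microL.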



Prepare the magnetic beads 

Prepare the appropriate quantity of CPG-MPG (magnetic porous glass beads). The same considerations made for the cap-trapper step are valid at this point.

Prepare 200 microL of CPG beads.
Add 5 microG of tRNA (20 mg/ml).

Incubate at RT for 10-20 min or on ice for 30-60 min, with occasional shaking.

Transfer the beads onto a magnetic stand for 3 minutes and remove the aqueous phase.

Wash 3 times with 1 M NaCl, 10 mM EDTA; use at least a volume equivalent to the starting volume of beads.

Re-suspend the beads in a volume of 1 M NaCl, 10 mM EDTA equivalent to the starting volume of beads.



Mix the washed beads and the Gsu I-cut sample.

Incubate at RT for 15 min with occasional gentle mixing.

Let it stand on a magnetic rack for 3 min.

Recover the supernatant.

Rinse 4X with 500 microL of 1X B&W buffer (binding and washing buffer = 5 mM Tris, pH 7.5, 0.5 mM EDTA, and 1 M NaCl) containing 1X BSA (bovine serum albumin).

Wash 2X with 200 microL of 1X ligase buffer (NEB).
[0073]

8. Ligating linkers to bound cDNA: II linker ligation. 

In this Example a linker with a recognition site for the restriction enzyme Eco RI is used. However, the invention is not dependent on or limited to the use of Eco RI in the second linker. Any other restriction enzyme and its recognition site can be used depending on their convenience for cloning the concatamers.



[0072]

7. Released cDNA tags
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Oligonucleotides to be synthesized: 
5 ' -GAGAGAGAGACTTTAGGTGACACTATAGMGAGTC(nX^GMTTCNN-3 ' 

5' -P-GAATTCTCAGGACTCTTCTA^ 

The oligonucleotides are purified and annealed as described for Linker 1.

Re-suspend in 20 microL of LoTE (1 mM Tris, pH 7.5, and 0.1 mM EDTA) and add linker II (0.4 microG/microL).

Heat the tube at 65 °C for 5 min, then let it sit at room temperature for 15 min.

Add 25 microL of TaKaRa ligation kit II solution II and 50 microL of solution I.
Incubate at 16 °C overnight.

After ligation, wash 4 times with 500 microL of 1X B&W buffer containing 1X BSA.

Wash once with 200 microL of 1X B&W buffer and twice with 200 microL of 1X Bgl II buffer containing 1X BSA.
[0074]

Release of cDNA tags using the Tagging Enzyme 
Add the following to the sample:

LoTE          X microL
10X buffer    10 microL
Bgl II        Y microL

Make up the volume to a total of 100 microL.

1) Incubate at 37 °C for 1 hour, gently mixing intermittently.

2) Place on the magnet and collect the supernatant into a new tube. The supernatant contains the released 5' end fragments.

3) Raise the volume to 200 microL with LoTE.



To 200 microL of sample (the 5' ends, tagged with linkers) add:

133 microL of 7.5 M NH4OAc

3 microL of 1 microG/microL glycogen

340 microL of isopropanol

Incubate at -20 or -80 °C for at least 30 min.

Spin for 20 min at 4 °C at 15,000 rpm in a micro-centrifuge. Remove the supernatant. Wash the pellet twice with 80% or 70% ethanol. Centrifuge for 3 min at 15,000 rpm and remove the ethanol wash. At the end, re-suspend in 10 microL of LoTE.
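
If the sample volume differs from 200 microL, the precipitation mix can be scaled in proportion. The following is only a sketch under the assumption that linear scaling of the 200 microL recipe is acceptable; the function name is hypothetical, and the 100 microL recipes given elsewhere in this protocol use slightly different, rounded volumes and more glycogen.

```python
def isopropanol_precipitation(sample_ul, glycogen_ul=3.0):
    """Scale the 200 microL precipitation recipe to another sample volume.

    Assumption: NH4OAc and isopropanol scale linearly with the sample
    volume, while the glycogen carrier is kept at a fixed volume.
    """
    scale = sample_ul / 200.0
    return {
        "NH4OAc_7.5M_ul": 133.0 * scale,
        "glycogen_ul": glycogen_ul,
        "isopropanol_ul": 340.0 * scale,
    }


# Example: the recipe as written (200 microL) and a half-size sample.
print(isopropanol_precipitation(200.0))
print(isopropanol_precipitation(100.0))
```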

[0075]
Ligating tags to form di-tags
The 5' ends of cDNAs are ligated to form di-tags.

1) Add 10 microL of TaKaRa ligation Kit II solution II and 20 microL of solution I.

2) Incubate overnight at 16 °C.

3) Add 10 microL of ddH2O, 1 microL of 0.5 M EDTA, 1 microL of 10% SDS and 1 microL of 10 microG/microL Proteinase K.

4) Incubate at 45 °C for 15 min.

5) Extract the aqueous phase once with 1:1 Tris-equilibrated phenol:chloroform, then with chloroform, and back-extract.

6) Remove the smallest cDNA fragments with a G-50 spun column (size exclusion).

7) Precipitate with isopropanol, adding 5 microG of glycogen as carrier:

100 microL sample
67 microL 7.5 M NH4OAc
5 microL glycogen
180 microL isopropanol

8) Spin for 20 min at 4 °C.

9) Wash twice with 80% or 70% ethanol, centrifuge and remove the ethanol.

[0076]

12. Cleavage of cDNA with anchoring enzyme

1) Re-suspend the sample in 5 microL of LoTE. Then add, in order:

LoTE                            X microL
10X EcoRI restriction buffer    5 microL
EcoRI                           Y microL (use 20 units of EcoRI)

Bring the volume up to a total of 50 microL.

2) Incubate at 37°C for 1 hour.

3) Add 1 microL of 0.5 M EDTA, 1 microL of 10% SDS and 1 microL of 10 microG/microL Proteinase K.

4) Incubate at 45 °C for 15 min.

5) Extract the aqueous phase once with 1:1 Tris-equilibrated phenol:chloroform, then with chloroform, and back-extract.

6) Precipitate with isopropanol, adding 5 microG of glycogen as carrier:

100 microL sample
67 microL 7.5 M NH4OAc
5 microL glycogen
180 microL isopropanol

8) Spin for 20 min at 4 °C.

9) Wash twice with 80% or 70% ethanol, centrifuge and remove the ethanol wash each time.

[0077]
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11. Ligation of di-tags to form concatemers 

1) Re-suspend in 5 microL of LoTE.

2) Add 5 microL of TaKaRa ligation kit II solution II and 10 microL of solution I.

3) Incubate for 1.5 hours at 16 °C.

4) Add 1 microL of 0.5 M EDTA, 1 microL of 10% SDS and 1 microL of 10 microG/microL Proteinase K.

5) Incubate at 45 °C for 15 min.

6) Extract the aqueous phase once with 1:1 Tris-equilibrated phenol:chloroform, then with chloroform, and back-extract.

7) Precipitate with isopropanol, adding 5 microG of glycogen as carrier:

100 microL sample
67 microL 7.5 M NH4OAc
5 microL glycogen
180 microL isopropanol

8) Spin for 20 min at 4 °C.

9) Wash twice with 80% or 70% ethanol, centrifuge and remove the ethanol. Dissolve in 5 microL of ddH2O.

[0078]
Example 2:

The above-obtained concatamers are to be further ligated into a cloning vector such as pBluescript II KS+ (Stratagene). A large variety of cloning vectors are known in the field and can be used for the invention.
Standard Ligation:

Mix a three-fold excess of concatamer DNA and 100 ng of an appropriate vector linearized with Eco RI in a volume of 5 microL. Then add 5 microL of Solution I of DNA Ligation Kit Ver.2 (Takara) to the insert/vector mixture. Incubate the tube at 16 °C for 12-16 h.

[0079]
Transformation:

To remove salt from the ligation solution, precipitate the DNA after the addition of 2 microG of glycogen (Roche), 20 mM sodium chloride and 80% ethanol. The DNA pellet is washed twice with 150 microL of 80% ethanol, and the pellet is then dissolved in 10 microL of water. Using 1 microL of the desalted ligation solution, ElectroMAX™ DH10B™ cells (Invitrogen) are transformed using a Cell-Porator or the like (Biometra) according to the transformation procedures described in the manufacturer's manual. Transformed bacteria are plated on a selective medium and grown overnight. Positive clones are to be isolated from those plates for further characterization of the concatamers.

[0080]
Example 3: Sequencing of concatemers

Sequencing of concatamers is performed using primers nested in the flanking regions of the cloning vector, a BigDye Terminator Cycle Sequencing Ready Reaction Kit v2.0 (Applied Biosystems) and an ABI3700 (Applied Biosystems) sequencer according to the manufacturer's product descriptions. The concatamers are sequenced from both ends to cover their entire sequence.

[0081]

Example 4: Identification of 5'-end sequence tags

The sequences obtained from concatamers are characterized by the structure of the di-tags as presented in Figure 5. Defined regions holding the recognition sites for the restriction enzymes used during the cloning steps flank each 5' end specific sequence tag. Therefore the 5' end specific sequence tags can be identified by a manual sequence analysis or by an automated process using an appropriate computer program. Individual 5' end specific sequence tags can be stored in a computer file or a database system.
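
An automated identification of this kind could look like the sketch below. It is an illustration only, not the program used by the inventors: the function name, the tag length window and the flanking sequences (a Gsu I site, CTGGAG, on one side and an Eco RI site, GAATTC, on the other) are assumptions chosen for the example and do not reproduce the exact di-tag layout of Figure 5.

```python
import re


def extract_tags(read, left_flank, right_flank, min_len=15, max_len=25):
    """Return the sequences found between two known flanking motifs.

    Assumptions: tags are min_len to max_len bases long and each one is
    bounded by left_flank upstream and right_flank downstream.
    """
    pattern = re.compile(
        re.escape(left_flank)
        + "([ACGT]{%d,%d}?)" % (min_len, max_len)   # shortest tag that fits
        + re.escape(right_flank)
    )
    return [match.group(1) for match in pattern.finditer(read.upper())]


# Toy read containing two tags, each flanked by the assumed motifs.
read = ("AAACTGGAG" "ACCTCCCTCCGCGGAG" "GAATTC"
        "CTGGAG" "TTTTGGGGCCCCAAAA" "GAATTCGG")
tags = extract_tags(read, left_flank="CTGGAG", right_flank="GAATTC")
print(tags)
```

The extracted tags can then be written to a file or loaded into a database table, one tag per row.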

[0082]

Example 5: Characterization of 5'-end sequence tags

5' end specific sequence tags can be analyzed for their identity by standard software solutions for sequence alignment, such as NCBI BLAST (http://www.ncbi.nlm.nih.gov/BLAST/), FASTA, available in the Genetics Computer Group (GCG) package from Accelrys Inc. (http://www.accelrys.com/), or the like. Such software solutions allow for an alignment of 5' end specific sequence tags among one another to identify unique or non-redundant tags, which can be further used in:
Database searches

Building a 5'-end sequence database
Gene identification using a 5'-end sequence database
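
For tags of identical length, a non-redundant set can also be obtained by simple exact matching before any alignment is run. The sketch below is illustrative only and assumes exact, case-insensitive identity; the function name is hypothetical, and alignment-based tools such as those named above would be needed to group tags that differ by sequencing errors.

```python
from collections import Counter


def tag_counts(tags):
    """Count how often each tag occurs, ignoring upper/lower case."""
    return Counter(tag.upper() for tag in tags)


# Toy input: the TGF-beta 1 tag seen twice plus one other tag.
tags = ["ACCTCCCTCCGCGGAG", "acctccctccgcggag", "GGGTTTAAACCCGGGT"]
counts = tag_counts(tags)
non_redundant = sorted(counts)          # one entry per distinct tag
print(non_redundant)
print(counts.most_common())
```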
An example of a BLAST search in GenBank using a 5' end specific tag is given below. The 16 bp tag (5'-ACC TCC CTC CGC GGA G) is derived from the 5' end of human TGF-b1: JBC 264 (1989) 402-408.
Query= (16 letters) (ACCTCCCTCCGCGGAG)

Database: All GenBank+EMBL+DDBJ+PDB sequences (but no EST, STS, 
GSS, or phase 0, 1 or 2 HTGS sequences) 

1,205,903 sequences; 5,297,768,116 total letters 



                                                                 Score    E
Sequences producing significant alignments:                      (bits)  Value

gi|10863872|ref|NM_000660.1| Homo sapiens transforming grow...     32    1.1
gi|18590091|ref|XM_085882.1| Homo sapiens similar to transf...     32    1.1
gi|11424057|ref|XM_008912.1| Homo sapiens transforming grow...     32    1.1
gi|7684381|gb|AC011462.4|AC011462 Homo sapiens chromosome 1...     32    1.1
gi|15027087|emb|AL389894.4|LMFLCHR4A Leishmania major Fried...     32    1.1
gi|1943914|gb|U70540.1|LMU70540 Leishmania mexicana amazone...     32    1.1
gi|37097|emb|X05839.1|HSTGFBG1 Human transforming growth fa...     32    1.1
gi|37092|emb|X02812.1|HSTGFB1 Human mRNA for transforming g...     32    1.1
gi|340526|gb|J04431.1|HUMTGFB1PR Homo sapiens transforming ...     32    1.1
gi|18858696|ref|NM_131728.1| Danio rerio forkhead box C1a (...     30    4.2
gi|12004937|gb|AF219949.1|AF219949 Danio rerio forkhead tra...     30    4.2
gi|193604|gb|M13366.1|MUSGPDX Mouse glycerophosphate dehydr...     30    4.2
gi|193601|gb|M25558.1|MUSGPD Mouse glycerol-3-phosphate deh...     30    4.2
gi|63465|emb|V00414.1|GGHI01 Gallus gallus mRNA coding for ...     30    4.2
gi|63444|emb|X13894.1|GGH2AF Chicken histone H2A.F gene            30    4.2

Alignments

£tliE#2 003-3072722 



2002-235294 ^ ^- v I 51/ 

>gi|10863872|ref|NM_000660.1| Homo sapiens transforming growth factor, beta 1
(Camurati-Engelmann disease) (TGFB1), mRNA
Length = 2745

Score = 32.2 bits (16), Expect = 1.1
Identities = 16/16 (100%)
Strand = Plus / Plus

Query: 1   acctccctccgcggag 16
           ||||||||||||||||
Sbjct: 1   acctccctccgcggag 16

>gi|18590091|ref|XM_085882.1| Homo sapiens similar to transforming growth factor, beta 1 (H.
sapiens) (LOC147760), mRNA
Length = 697

Score = 32.2 bits (16), Expect = 1.1
Identities = 16/16 (100%)
Strand = Plus / Plus

Query: 1   acctccctccgcggag 16
           ||||||||||||||||
Sbjct: 7   acctccctccgcggag 22

>gi|11424057|ref|XM_008912.1| Homo sapiens transforming growth factor,
beta 1 (TGFB1), mRNA
Length = 2741

Score = 32.2 bits (16), Expect = 1.1
Identities = 16/16 (100%)
Strand = Plus / Plus

Query: 1   acctccctccgcggag 16
           ||||||||||||||||
Sbjct: 1   acctccctccgcggag 16

Database: All GenBank+EMBL+DDBJ+PDB sequences (but no EST, STS, GSS, 
or phase 0, 1 or 2 HTGS sequences) 

Posted date: Apr 9, 2002 10:59 AM 
Number of letters in database: 1,002,800,820 
Number of sequences in database: 1,205,903 
Lambda     K      H
   1.37    0.711     1.31

Gapped
Lambda     K      H
   1.37    0.711     1.31

Matrix: blastn matrix: 1 -3
Gap Penalties: Existence: 5, Extension: 2
Number of Hits to DB: 6901
Number of Sequences: 1205903
Number of extensions: 6901
Number of successful extensions: 1479
Number of sequences better than 10.0: 16
length of query: 16
length of database: 5,297,768,116
effective HSP length: 15
effective length of query: 1
effective length of database: 5,279,679,571
effective search space: 5279679571
effective search space used: 5279679571
T: 0
A: 30
X1: 6 (11.9 bits)
X2: 15 (29.7 bits)
S1: 12 (24.3 bits)
S2: 15 (30.2 bits)
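
The statistics printed by BLAST are internally consistent and can be re-derived from the figures in the report. The sketch below assumes the standard relations used by BLAST (each length is reduced by the effective HSP length, and the expect value is the search space times 2 to the power of minus the bit score); the function names are hypothetical.

```python
def effective_search_space(query_len, db_len, n_seqs, hsp_len):
    """Effective search space after the BLAST length adjustment."""
    eff_query = max(query_len - hsp_len, 1)       # 16 - 15 = 1
    eff_db = db_len - n_seqs * hsp_len            # one adjustment per sequence
    return eff_query * eff_db


def expect_value(bit_score, search_space):
    """Expect value E = search_space * 2^(-bit_score)."""
    return search_space * 2.0 ** (-bit_score)


# Figures taken from the report above.
space = effective_search_space(16, 5_297_768_116, 1_205_903, 15)
e_value = expect_value(32.2, space)
print(space, round(e_value, 1))
```

This reproduces the reported effective search space of 5279679571 and an expect value of about 1.1 for a 16/16 match, which shows why a single 16 bp tag is expected to hit a database of this size about once by chance.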




1: NM_000660. Homo sapiens tran...[gi:10863872]

LOCUS       NM_000660    2745 bp    mRNA    linear   PRI 13-FEB-2002
DEFINITION  Homo sapiens transforming growth factor, beta 1 (Camurati-Engelmann
            disease) (TGFB1), mRNA.
ACCESSION   NM_000660
VERSION     NM_000660.1  GI:10863872
KEYWORDS
SOURCE      human.
  ORGANISM  Homo sapiens
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.
REFERENCE   1  (bases 1 to 2745)
  AUTHORS   Derynck,R., Jarrett,J.A., Chen,E.Y., Eaton,D.H., Bell,J.R.,
            Assoian,R.K., Roberts,A.B., Sporn,M.B. and Goeddel,D.V.
  TITLE     Human transforming growth factor-beta complementary DNA sequence
            and expression in normal and transformed cells
  JOURNAL   Nature 316 (6030), 701-705 (1985)
  MEDLINE   85296301
REFERENCE   2  (bases 1 to 2745)
  AUTHORS   Sporn,M.B., Roberts,A.B., Wakefield,L.M. and Assoian,R.K.
  TITLE     Transforming growth factor-beta: biological function and chemical
            structure
  JOURNAL   Science 233 (4763), 532-534 (1986)
  MEDLINE   86261803
  PUBMED    3487831
REFERENCE   3  (bases 1 to 2745)
  AUTHORS   Chang,N.S., Mattison,J., Cao,H., Pratt,N., Zhao,Y. and Lee,C.
  TITLE     Cloning and characterization of a novel transforming growth
            factor-beta1-induced TIAF1 protein that inhibits tumor necrosis
            factor cytotoxicity
  JOURNAL   Biochem. Biophys. Res. Commun. 253 (3), 743-749 (1998)
  MEDLINE   99119079
  PUBMED    9918798
REFERENCE   4  (bases 1 to 2745)
  AUTHORS   Ghadami,M., Makita,Y., Yoshida,K., Nishimura,G., Fukushima,Y.,
            Wakui,K., Ikegawa,S., Yamada,K., Kondo,S., Niikawa,N. and Tomita,H.
  TITLE     Genetic mapping of the Camurati-Engelmann disease locus to
            chromosome 19q13.1-q13.3
  JOURNAL   Am. J. Hum. Genet. 66 (1), 143-147 (2000)
  MEDLINE   20100617
  PUBMED    10631145
REFERENCE   5  (bases 1 to 2745)
  AUTHORS   Vaughn,S.P., Broussard,S., Hall,C.R., Scott,A., Blanton,S.H.,
            Milunsky,J.M. and Hecht,J.T.
  TITLE     Confirmation of the mapping of the Camurati-Englemann locus to
            19q13.2 and refinement to a 3.2-cM region
  JOURNAL   Genomics 66 (1), 119-121 (2000)
  MEDLINE   20304762
  PUBMED    10843814
REFERENCE   6  (bases 1 to 2745)
  AUTHORS   Lim,J.M., Kim,J.A., Lee,J.H. and Joo,C.K.
  TITLE     Downregulated expression of integrin alpha6 by transforming growth
            factor-beta(1) on lens epithelial cells in vitro
  JOURNAL   Biochem. Biophys. Res. Commun. 284 (1), 33-41 (2001)
  MEDLINE   21268957
  PUBMED    11374867
COMMENT     PROVISIONAL REFSEQ: This record has not yet been subject to final
            NCBI review. The reference sequence was derived from X02812.1.
FEATURES             Location/Qualifiers
     source          1..2745
                     /organism="Homo sapiens"
                     /db_xref="taxon:9606"
                     /chromosome="19"
                     /map="19q13.1"
     gene            1..2745
                     /gene="TGFB1"
                     /note="TGFB; DPD1; CED"
                     /db_xref="LocusID:7040"
                     /db_xref="MIM:190180"
     misc_feature    37..113
                     /note="pot. hairpin loops-forming region"
     variation       72
                     /allele="-"
                     /allele="C"
                     /db_xref="dbSNP:1800999"
     variation       79
                     /allele="-"
                     /allele="C"
                     /db_xref="dbSNP:1799753"
     CDS             842..2017
                     /gene="TGFB1"
                     /note="transforming growth factor, beta 1; diaphyseal
                     dysplasia 1, progressive (Camurati-Engelmann disease)"
                     /codon_start=1
                     /db_xref="LocusID:7040"
                     /db_xref="MIM:190180"
                     /product="transforming growth factor, beta 1
                     (Camurati-Engelmann disease)"
                     /protein_id="NP_000651.1"
                     /db_xref="GI:10863873"

/translat i on='WPSGLRLLPLLLPLLWLLVLTPGPPAAGLSTCKTID 

MELVKRK 

RIEAIRGQII^KLRI^SPPSQGEWPGPIPEAVIJiLYNSTRDRVAGESAEPEPEPEAD 

YYAKEmVIJilVETHNEIYDKre 
UtLKVEQHVELYQKYSNNSWRYI^ 
RLSAHCSCDSRDNTLQVDINGFTTG^ 
HRRALDTNYCFSSTEKNCCWQLYIDF^ 
TQYSKVIJ&YNQHNPGASAAPCCWQA^^ 
     misc_feature    863..910
                     /note="pot. core sequence of signal peptide (aa -272 to
                     -257)"
     variation       870
                     /allele="C"
                     /allele="T"
                     /db_xref="dbSNP:1982073"
     variation       915
                     /allele="C"
                     /allele="G"
                     /db_xref="dbSNP:1800471"
     misc_feature    938..1600
                     /note="TGFb_propeptide; Region: TGF-beta propeptide"
     misc_feature    953
                     /note="pot. altern. translation start site"
     misc_feature    1035..1043
                     /note="put. glycosylation site"
     misc_feature    1247..1255
                     /note="put. glycosylation site"
     misc_feature    1370..1378
                     /note="put. glycosylation site"
     variation       1632
                     /allele="C"
                     /allele="T"
                     /db_xref="dbSNP:1800472"
     mat_peptide     1679..2014
                     /product="mature TGF-beta (aa 1-112)"
     misc_feature    1715..2014
                     /note="TGF-beta; Region: Transforming growth factor beta
                     like domain"
     misc_feature    1721..2014
                     /note="TGFB; Region: Transforming growth factor-beta
                     (TGF-beta) family"
     misc_feature    2018..2096
                     /note="GC-rich region"
     promoter        2097..2103
                     /note="TATA-box-like region"
     misc_feature    2517..2522
                     /note="put. polyadenylation signal"
     polyA_site      2539
                     /note="polyadenylation site"
BASE COUNT      527 a    938 c    801 g    479 t
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ORIGIN 

        1 acctccctcc gcggagcagc cagacagcga gggccccggc cgggggcagg ggggacgccc
       61 cgtccggggc accccccccg gctctgagcc gcccgcgggg ccggcctcgg cccggagcgg
      121 aggaaggagt cgccgaggag cagcctgagg ccccagagtc tgagacgagc cgccgccgcc
      181 cccgccactg cggggaggag ggggaggagg agcgggagga gggacgagct ggtcgggaga
      241 agaggaaaaa aacttttgag acttttccgt tgccgctggg agccggaggc gcggggacct
      301 cttggcgcga cgctgccccg cgaggaggca ggacttgggg accccagacc gcctcccttt
      361 gccgccgggg acgcttgctc cctccctgcc ccctacacgg cgtccctcag gcgcccccat
      421 tccggaccag ccctcgggag tcgccgaccc ggcctcccgc aaagactttt ccccagacct
      481 cgggcgcacc ccctgcacgc cgccttcatc cccggcctgt ctcctgagcc cccgcgcatc
      541 ctagaccctt tctcctccag gagacggatc tctctccgac ctgccacaga tcccctattc
      601 aagaccaccc accttctggt accagatcgc gcccatctag gttatttccg tgggatactg
      661 agacaccccc ggtccaagcc tcccctccac cactgcgccc ttctccctga ggagcctcag
      721 ctttccctcg aggccctcct accttttgcc gggagacccc cagcccctgc aggggcgggg
      781 cctccccacc acaccagccc tgttcgcgct ctcggcagtg ccggggggcg ccgcctcccc
      841 catgccgccc tccgggctgc ggctgctgcc gctgctgcta ccgctgctgt ggctactggt
      901 gctgacgcct ggcccgccgg ccgcgggact atccacctgc aagactatcg acatggagct
      961 ggtgaagcgg aagcgcatcg aggccatccg cggccagatc ctgtccaagc tgcggctcgc
     1021 cagccccccg agccaggggg aggtgccgcc cggcccgctg cccgaggccg tgctcgccct
     1081 gtacaacagc acccgcgacc gggtggccgg ggagagtgca gaaccggagc ccgagcctga
     1141 ggccgactac tacgccaagg aggtcacccg cgtgctaatg gtggaaaccc acaacgaaat
     1201 ctatgacaag ttcaagcaga gtacacacag catatatatg ttcttcaaca catcagagct
     1261 ccgagaagcg gtacctgaac ccgtgttgct ctcccgggca gagctgcgtc tgctgaggag
     1321 gctcaagtta aaagtggagc agcacgtgga gctgtaccag aaatacagca acaattcctg
     1381 gcgatacctc agcaaccggc tgctggcacc cagcgactcg ccagagtggt tatcttttga
     1441 tgtcaccgga gttgtgcggc agtggttgag ccgtggaggg gaaattgagg gctttcgcct
     1501 tagcgcccac tgctcctgtg acagcaggga taacacactg caagtggaca tcaacgggtt
     1561 cactaccggc cgccgaggtg acctggccac cattcatggc atgaaccggc ctttcctgct
     1621 tctcatggcc accccgctgg agagggccca gcatctgcaa agctcccggc accgccgagc
     1681 cctggacacc aactattgct tcagctccac ggagaagaac tgctgcgtgc ggcagctgta
     1741 cattgacttc cgcaaggacc tcggctggaa gtggatccac gagcccaagg gctaccatgc
     1801 caacttctgc ctcgggccct gcccctacat ttggagcctg gacacgcagt acagcaaggt
     1861 cctggccctg tacaaccagc ataacccggg cgcctcggcg gcgccgtgct gcgtgccgca
     1921 ggcgctggag ccgctgccca tcgtgtacta cgtgggccgc aagcccaagg tggagcagct
     1981 gtccaacatg atcgtgcgct cctgcaagtg cagctgaggt cccgccccgc cccgccccgc
     2041 cccggcaggc ccggccccac cccgccccgc ccccgctgcc ttgcccatgg gggctgtatt
     2101 taaggacacc gtgccccaag cccacctggg gccccattaa agatggagag aggactgcgg
     2161 atctctgtgt cattgggcgc ctgcctgggg tctccatccc tgacgttccc ccactcccac
     2221 tccctctctc tccctctctg cctcctcctg cctgtctgca ctattccttt gcccggcatc
     2281 aaggcacagg ggaccagtgg ggaacactac tgtagttaga tctatttatt gagcaccttg
     2341 ggcactgttg aagtgcctta cattaatgaa ctcattcagt caccatagca acactctgag
     2401 atggcaggga ctctgataac acccatttta aaggttgagg aaacaagccc agagaggtta
     2461 agggaggagt tcctgcccac caggaacctg ctttagtggg ggatagtgaa gaagacaata
     2521 aaagatagta gttcaggcca ggcggggtgc tcacgcctgt aatcctagca cttttgggag
     2581 gcagagatgg gaggatactt gaatccaggc atttgagacc agcctgggta acatagtgag
     2641 accctatctc tacaaaacac ttttaaaaaa tgtacacctg tggtcccagc tactctggag
     2701 gctaaggtgg gaggatcact tgatcctggg aggtcaaggc tgcag

// 


Revised: October 24, 2001. 

Query= (16 letters) 
Database: GenBank Human EST entries 
4,280,058 sequences; 2,114,234,064 total letters 

                                                                 Score    E
Sequences producing significant alignments:                      (bits)  Value

gi|19365764|gb|BM915385.1|BM915385 AGENCOURT_6701642 NIH_MG...     32    0.41
gi|19353768|gb|BM903897.1|BM903897 AGENCOURT_6696012 NIH_MG...     32    0.41
gi|18807810|gb|BM562052.1|BM562052 AGENCOURT_6562015 NIH_MG...     32    0.41
gi|18791603|gb|BM553137.1|BM553137 AGENCOURT_6572574 NIH_MG...     32    0.41
gi|16171065|gb|BI908151.1|BI908151 603067456F1 NIH_MGC_118 ...     32    0.41
gi|15759271|gb|BI767693.1|BI767693 603060648F1 NIH_MGC_122 ...     32    0.41
gi|15343643|gb|BI518851.1|BI518851 603061760F1 NIH_MGC_118 ...     32    0.41
gi|14309343|gb|BG899094.1|BG899094 HOA21-1-G9 HOA (Human Os...     32    0.41
gi|13662542|gb|BG611171.1|BG611171 602612144F1 NIH_MGC_60 H...     32    0.41
gi|12609210|gb|BG115704.1|BG115704 602317174F1 NIH_MGC_88 H...     32    0.41
gi|12101282|gb|BF796228.1|BF796228 602258513F1 NIH_MGC_85 H...     32    0.41
gi|11152079|gb|BF238160.1|BF238160 601811886F1 NIH_MGC_48 H...     32    0.41
gi|11100313|gb|BF206727.1|BF206727 601871105F1 NIH_MGC_19 H...     32    0.41
gi|11100272|gb|BF206686.1|BF206686 601871051F1 NIH_MGC_19 H...     32    0.41
gi|16775383|gb|BM046103.1|BM046103 603625849F1 NIH_MGC_40 H...     30    1.6
gi|19739174|gb|BQ014273.1|BQ014273 UI-H-ED1-axs-h-21-0-UI.s...     28    6.4
gi|19378603|gb|BM928224.1|BM928224 AGENCOURT_6699855 NIH_MG...     28    6.4
gi|19367808|gb|BM917429.1|BM917429 AGENCOURT_6606724 NIH_MG...     28    6.4
gi|19364214|gb|BM913835.1|BM913835 AGENCOURT_6612786 NIH_MG...     28    6.4
gi|19361343|gb|BM910964.1|BM910964 AGENCOURT_6615957 NIH_MG...     28    6.4
gi|18505954|gb|BM456914.1|BM456914 AGENCOURT_6404253 NIH_MG...     28    6.4
gi|18499709|gb|BM450669.1|BM450669 AGENCOURT_6394717 NIH_MG...     28    6.4
gi|16000196|gb|BI859449.1|BI859449 603388188F1 NIH_MGC_87 H...     28    6.4
gi|15928460|gb|BI818193.1|BI818193 603032663F1 NIH_MGC_115 ...     28    6.4
gi|15431547|gb|BI544235.1|BI544235 603241605F1 NIH_MGC_95 H...     28    6.4
gi|15345229|gb|BI520437.1|BI520437 603071622F1 NIH_MGC_119 ...     28    6.4
gi|14440373|gb|BI033747.1|BI033747 PM3-NN0223-220201-014-h0...     28    6.4
gi|14426676|gb|BI020046.1|BI020046 CM3-MT0291-110101-622-f0...     28    6.4
gi|14081325|gb|BG770672.1|BG770672 602734012F1 NIH_MGC_49 H...     28    6.4
gi|13546630|gb|BG547965.1|BG547965 602576071F1 NIH_MGC_77 H...     28    6.4
gi|13030375|gb|BG281450.1|BG281450 602401966F1 NIH_MGC_20 H...     28    6.4
gi|12951460|emb|AL582959.1|AL582959 AL582959 LTI_NFL010_BC2...     28    6.4
gi|12764352|gb|BG254536.1|BG254536 602368464F1 NIH_MGC_91 H...     28    6.4
gi|12378592|gb|BF961317.1|BF961317 PM3-NN0223-111200-004-d0...     28    6.4
gi|12374538|gb|BF957263.1|BF957263 PM3-NN0223-241100-002-b0...     28    6.4
gi|12323114|gb|BF926150.1|BF926150 CM2-NT0193-301100-562-a1...     28    6.4
gi|12259862|gb|BF869732.1|BF869732 IL3-ET0114-251000-316-A1...     28    6.4
gi|12129894|gb|BF800905.1|BF800905 PM1-CI0110-201000-003-f0...     28    6.4
gi|12071436|gb|BF744760.1|BF744760 QV2-BT0635-311000-440-c1...     28    6.4
gi|11770407|gb|BE965733.2|BE965733 601659792R1 NIH_MGC_70 H...     28    6.4
gi|11766539|gb|BE963121.2|BE963121 601656923R1 NIH_MGC_67 H...     28    6.4
gi|10348536|gb|BE890328.1|BE890328 601431783F1 NIH_MGC_72 H...     28    6.4
gi|10142985|gb|BE728993.1|BE728993 601562251F1 NIH_MGC_20 H...     28    6.4
gi|10095527|gb|BE707262.1|BE707262 PM1-HT0452-060700-008-e0...     28    6.4
gi|9772196|gb|BE543551.1|BE543551 601070523F1 NIH_MGC_12 Ho...     28    6.4
gi|9768571|gb|BE539926.1|BE539926 601060667F2 NIH_MGC_10 Ho...     28    6.4
gi|9342607|gb|BE397242.1|BE397242 601290754F1 NIH_MGC_8 Hom...     28    6.4
gi|9332870|gb|BE387505.1|BE387505 601274247F1 NIH_MGC_20 Ho...     28    6.4
gi|8140649|gb|AW950985.1|AW950985 EST363055 MAGE resequence...     28    6.4
gi|8139665|gb|AW950129.1|AW950129 EST362094 MAGE resequence...     28    6.4
gi|6879658|gb|AW375004.1|AW375004 MR0-CT0068-280999-002-f07...     28    6.4
gi|5435227|emb|AL079651.1|AL079651 DKFZp434N0629_r1 434 (sy...     28    6.4
gi|5406349|emb|AL036861.1|AL036861 DKFZp564O1963_r1 564 (sy...     28    6.4
gi|2566893|gb|AA641675.1|AA641675 nr62g01.s1 NCI_CGAP_Lym3 ...     28    6.4
gi|2080087|gb|AA418268.1|AA418268 zv96d09.s1 Soares_NhHMPu_...     28    6.4
gi|2056455|gb|AA402650.1|AA402650 zu49g06.r1 Soares ovary t...     28    6.4
gi|1516398|gb|AA040102.1|AA040102 zk46e02.r1 Soares_pregnan...     28    6.4

Alignments

>gi|19365764|gb|BM915385.1|BM915385 AGENCOURT_6701642 NIH_MGC_41 Homo sapiens cDNA clone IMAGE:5481560 5'.
Length = 1086

Score = 32.2 bits (16), Expect = 0.41
Identities = 16/16 (100%)
Strand = Plus / Plus

Query: 1  acctccctccgcggag 16
          ||||||||||||||||
Sbjct: 23 acctccctccgcggag 38

>gi|19353768|gb|BM903897.1|BM903897 AGENCOURT_6696012 NIH_MGC_67 Homo sapiens cDNA clone IMAGE:5492392 5'.
Length = 1497

Score = 32.2 bits (16), Expect = 0.41
Identities = 16/16 (100%)
Strand = Plus / Minus

Query: 1   acctccctccgcggag 16
           ||||||||||||||||
Sbjct: 445 acctccctccgcggag 430

>gi|18807810|gb|BM562052.1|BM562052 AGENCOURT_6562015 NIH_MGC_118 Homo sapiens cDNA clone IMAGE:5745414 5'.
Length = 1175

Score = 32.2 bits (16), Expect = 0.41
Identities = 16/16 (100%)
Strand = Plus / Plus

Query: 1  acctccctccgcggag 16
          ||||||||||||||||
Sbjct: 20 acctccctccgcggag 35

>gi|18791603|gb|BM553137.1|BM553137 AGENCOURT_6572574 NIH_MGC_41 Homo sapiens cDNA clone IMAGE:5467063 5'.
Length = 1100

Score = 32.2 bits (16), Expect = 0.41
Identities = 16/16 (100%)
Strand = Plus / Plus

Query: 1  acctccctccgcggag 16
          ||||||||||||||||
Sbjct: 26 acctccctccgcggag 41

>gi|16171065|gb|BI908151.1|BI908151 603067456F1 NIH_MGC_118 Homo sapiens cDNA clone IMAGE:5216508 5'.
Length = 706

Score = 32.2 bits (16), Expect = 0.41
Identities = 16/16 (100%)
Strand = Plus / Plus

Query: 1  acctccctccgcggag 16
          ||||||||||||||||
Sbjct: 25 acctccctccgcggag 40

>gi|15759271|gb|BI767693.1|BI767693 603060648F1 NIH_MGC_122 Homo sapiens cDNA clone IMAGE:5209978 5'.
Length = 862

Score = 32.2 bits (16), Expect = 0.41
Identities = 16/16 (100%)
Strand = Plus / Plus

Query: 1   acctccctccgcggag 16
           ||||||||||||||||
Sbjct: 705 acctccctccgcggag 720

>gi|15343643|gb|BI518851.1|BI518851 503061760F1 NIH_MGC_118 Homo sapiens cDNA clone IMAGE:5210943 5'.
Length = 943

Score = 32.2 bits (16), Expect = 0.41
Identities = 16/16 (100%)
Strand = Plus / Plus

Query: 1  acctccctccgcggag 16
          ||||||||||||||||
Sbjct: 25 acctccctccgcggag 40

>gi|14309343|gb|BG899094.1|BG899094 H0A21-1-G9 HOA (Human Osteoarthritic Cartilage) Homo sapiens cDNA.
Length = 364

Score = 32.2 bits (16), Expect = 0.41
Identities = 16/16 (100%)
Strand = Plus / Plus

Query: 1  acctccctccgcggag 16
          ||||||||||||||||
Sbjct: 83 acctccctccgcggag 98

>gi|13662542|gb|BG611171.1|BG611171 602612144F1 NIH_MGC_60 Homo sapiens cDNA clone IMAGE:4737466 5'.
Length = 897

Score = 32.2 bits (16), Expect = 0.41
Identities = 16/16 (100%)
Strand = Plus / Minus

Query: 1   acctccctccgcggag 16
           ||||||||||||||||
Sbjct: 809 acctccctccgcggag 794

>gi|12609210|gb|BG115704.1|BG115704 602317174F1 NIH_MGC_88 Homo sapiens cDNA clone IMAGE:4417482 5'.
Length = 838

Score = 32.2 bits (16), Expect = 0.41
Identities = 16/16 (100%)
Strand = Plus / Plus

Query: 1  acctccctccgcggag 16
          ||||||||||||||||
Sbjct: 51 acctccctccgcggag 66

>gi|12101282|gb|BF796228.1|BF796228 602258513F1 NIH_MGC_85 Homo sapiens cDNA clone IMAGE:4341962 5'.
Length = 1081

Score = 32.2 bits (16), Expect = 0.41
Identities = 16/16 (100%)
Strand = Plus / Plus

Query: 1 acctccctccgcggag 16
         ||||||||||||||||
Sbjct: 7 acctccctccgcggag 22

>gi|11152079|gb|BF238160.1|BF238160 601811886F1 NIH_MGC_48 Homo sapiens cDNA clone IMAGE:4054821 5'.
Length = 811

Score = 32.2 bits (16), Expect = 0.41
Identities = 16/16 (100%)
Strand = Plus / Plus

Query: 1  acctccctccgcggag 16
          ||||||||||||||||
Sbjct: 11 acctccctccgcggag 26

>gi|11100313|gb|BF206727.1|BF206727 601871105F1 NIH_MGC_19 Homo sapiens cDNA clone IMAGE:4101600 5'.
Length = 888

Score = 32.2 bits (16), Expect = 0.41
Identities = 16/16 (100%)
Strand = Plus / Plus

Query: 1  acctccctccgcggag 16
          ||||||||||||||||
Sbjct: 32 acctccctccgcggag 47

>gi|11100272|gb|BF206686.1|BF206686 601871051F1 NIH_MGC_19 Homo sapiens cDNA clone IMAGE:4101517 5'.
Length = 917

Score = 32.2 bits (16), Expect = 0.41
Identities = 16/16 (100%)
Strand = Plus / Plus

Query: 1  acctccctccgcggag 16
          ||||||||||||||||
Sbjct: 33 acctccctccgcggag 48

>gi|16775383|gb|BM046103.1|BM046103 603625849F1 NIH_MGC_40 Homo sapiens cDNA clone IMAGE:5452309 5'.
Length = 869

Score = 30.2 bits (15), Expect = 1.6
Identities = 15/15 (100%)
Strand = Plus / Plus

Query: 2   cctccctccgcggag 16
           |||||||||||||||
Sbjct: 692 cctccctccgcggag 706

>gi|19739174|gb|BQ014273.1|BQ014273 UI-H-ED1-axs-h-21-0-UI.s1 NCI_CGAP_ED1 Homo sapiens cDNA clone IMAGE:5833028 3'.
Length = 772

Score = 28.2 bits (14), Expect = 6.4
Identities = 14/14 (100%)
Strand = Plus / Minus

Query: 2   cctccctccgcgga 15
           ||||||||||||||
Sbjct: 495 cctccctccgcgga 482

>gi|19378603|gb|BM928224.1|BM928224 AGENCOURT_6699855 NIH_MGC_121 Homo sapiens cDNA clone IMAGE:5770072 5'.
Length = 1140

Score = 28.2 bits (14), Expect = 6.4
Identities = 14/14 (100%)
Strand = Plus / Plus

Query: 2    cctccctccgcgga 15
            ||||||||||||||
Sbjct: 1009 cctccctccgcgga 1022

>gi|19367808|gb|BM917429.1|BM917429 AGENCOURT_6606724 NIH_MGC_106 Homo sapiens cDNA clone IMAGE:5483947 5'.
Length = 1073

Score = 28.2 bits (14), Expect = 6.4
Identities = 14/14 (100%)
Strand = Plus / Plus

Query: 1   acctccctccgcgg 14
           ||||||||||||||
Sbjct: 916 acctccctccgcgg 929

>gi|19364214|gb|BM913835.1|BM913835 AGENCOURT_6612786 NIH_MGC_98 Homo sapiens cDNA clone IMAGE:5477539 5'.
Length = 1104

Score = 28.2 bits (14), Expect = 6.4
Identities = 14/14 (100%)
Strand = Plus / Minus

Query: 2   cctccctccgcgga 15
           ||||||||||||||
Sbjct: 842 cctccctccgcgga 829

>gi|19361343|gb|BM910964.1|BM910964 AGENCOURT_6615957 NIH_MGC_98 Homo sapiens cDNA clone IMAGE:5454547 5'.
Length = 1128

Score = 28.2 bits (14), Expect = 6.4
Identities = 14/14 (100%)
Strand = Plus / Minus

Query: 3   ctccctccgcggag 16
           ||||||||||||||
Sbjct: 883 ctccctccgcggag 870

>gi|18505954|gb|BM456914.1|BM456914 AGENCOURT_6404253 NIH_MGC_92 Homo sapiens cDNA clone IMAGE:5583862 5'.
Length = 1813

Score = 28.2 bits (14), Expect = 6.4
Identities = 14/14 (100%)
Strand = Plus / Plus

Query: 2  cctccctccgcgga 15
          ||||||||||||||
Sbjct: 29 cctccctccgcgga 42

>gi|18499709|gb|BM450669.1|BM450669 AGENCOURT_6394717 NIH_MGC_67 Homo sapiens cDNA clone IMAGE:5494366 5'.
Length = 1430

Score = 28.2 bits (14), Expect = 6.4
Identities = 14/14 (100%)
Strand = Plus / Plus

Query: 1    acctccctccgcgg 14
            ||||||||||||||
Sbjct: 1150 acctccctccgcgg 1163

>gi|16000196|gb|BI859449.1|BI859449 603388188F1 NIH_MGC_87 Homo sapiens cDNA clone IMAGE:5396997 5'.
Length = 852

Score = 28.2 bits (14), Expect = 6.4
Identities = 14/14 (100%)
Strand = Plus / Plus

Query: 1   acctccctccgcgg 14
           ||||||||||||||
Sbjct: 100 acctccctccgcgg 113

>gi|15928460|gb|BI818193.1|BI818193 603032663F1 NIH_MGC_115 Homo sapiens cDNA clone IMAGE:5173838 5'.
Length = 683

Score = 28.2 bits (14), Expect = 6.4
Identities = 14/14 (100%)
Strand = Plus / Plus

Query: 2  cctccctccgcgga 15
          ||||||||||||||
Sbjct: 96 cctccctccgcgga 109

>gi|15431547|gb|BI544235.1|BI544235 603241605F1 NIH_MGC_95 Homo sapiens cDNA clone IMAGE:5284296 5'.
Length = 676

Score = 28.2 bits (14), Expect = 6.4
Identities = 14/14 (100%)
Strand = Plus / Minus

Query: 3  ctccctccgcggag 16
          ||||||||||||||
Sbjct: 39 ctccctccgcggag 26

>gi|15345229|gb|BI520437.1|BI520437 603071622F1 NIH_MGC_119 Homo sapiens cDNA clone IMAGE:5163773 5'.
Length = 727

Score = 28.2 bits (14), Expect = 6.4
Identities = 14/14 (100%)
Strand = Plus / Minus

Query: 1   acctccctccgcgg 14
           ||||||||||||||
Sbjct: 505 acctccctccgcgg 492

>gi|14440373|gb|BI033747.1|BI033747 PM3-NN0223-220201-014-h04 NN0223 Homo sapiens cDNA.
Length = 284

Score = 28.2 bits (14), Expect = 6.4
Identities = 14/14 (100%)
Strand = Plus / Minus

Query: 1  acctccctccgcgg 14
          ||||||||||||||
Sbjct: 97 acctccctccgcgg 84

>gi|14426676|gb|BI020046.1|BI020046 CM3-MT0291-110101-622-f04 MT0291 Homo sapiens cDNA.
Length = 436

Score = 28.2 bits (14), Expect = 6.4
Identities = 14/14 (100%)
Strand = Plus / Minus

Query: 1   acctccctccgcgg 14
           ||||||||||||||
Sbjct: 365 acctccctccgcgg 352

>gi|14081325|gb|BG770672.1|BG770672 602734012F1 NIH_MGC_49 Homo sapiens cDNA clone IMAGE:4859546 5'.
Length = 949

Score = 28.2 bits (14), Expect = 6.4
Identities = 14/14 (100%)
Strand = Plus / Plus

Query: 1  acctccctccgcgg 14
          ||||||||||||||
Sbjct: 63 acctccctccgcgg 76

>gi|13546630|gb|BG547965.1|BG547965 602576071F1 NIH_MGC_77 Homo sapiens cDNA clone IMAGE:4704209 5'.
Length = 918

Score = 28.2 bits (14), Expect = 6.4
Identities = 14/14 (100%)
Strand = Plus / Plus

Query: 1   acctccctccgcgg 14
           ||||||||||||||
Sbjct: 248 acctccctccgcgg 261

>gi|13030375|gb|BG281450.1|BG281450 602401966F1 NIH_MGC_20 Homo sapiens cDNA clone IMAGE:4544201 5'.
Length = 782

Score = 28.2 bits (14), Expect = 6.4
Identities = 14/14 (100%)
Strand = Plus / Plus

Query: 1   acctccctccgcgg 14
           ||||||||||||||
Sbjct: 417 acctccctccgcgg 430

>gi|12951460|emb|AL582959.1|AL582959 AL582959 LTI_NFL010_BC2 Homo sapiens cDNA clone CS0DL0O8YA12 3 prime.
Length = 822

Score = 28.2 bits (14), Expect = 6.4
Identities = 14/14 (100%)
Strand = Plus / Minus

Query: 2   cctccctccgcgga 15
           ||||||||||||||
Sbjct: 533 cctccctccgcgga 520

>gi|12764352|gb|BG254536.1|BG254536 602368464F1 NIH_MGC_91 Homo sapiens cDNA clone IMAGE:4476902 5'.
Length = 1031

Score = 28.2 bits (14), Expect = 6.4
Identities = 14/14 (100%)
Strand = Plus / Plus

Query: 2   cctccctccgcgga 15
           ||||||||||||||
Sbjct: 849 cctccctccgcgga 862

>gi|12378592|gb|BF961317.1|BF961317 PM3-NN0223-111200-004-d03 NN0223 Homo sapiens cDNA.
Length = 277

Score = 28.2 bits (14), Expect = 6.4
Identities = 14/14 (100%)
Strand = Plus / Minus

Query: 1  acctccctccgcgg 14
          ||||||||||||||
Sbjct: 89 acctccctccgcgg 76

>gi|12374538|gb|BF957263.1|BF957263 PM3-NN0223-241100-002-b08 NN0223 Homo sapiens cDNA.
Length = 168

Score = 28.2 bits (14), Expect = 6.4
Identities = 14/14 (100%)
Strand = Plus / Minus

Query: 1   acctccctccgcgg 14
           ||||||||||||||
Sbjct: 117 acctccctccgcgg 104

>gi|12323114|gb|BF926150.1|BF926150 CM2-NT0193-301100-562-a12 NT0193 Homo sapiens cDNA.
Length = 417

Score = 28.2 bits (14), Expect = 6.4
Identities = 14/14 (100%)
Strand = Plus / Minus

Query: 2   cctccctccgcgga 15
           ||||||||||||||
Sbjct: 268 cctccctccgcgga 255

>gi|12259862|gb|BF869732.1|BF869732 IL3-ET0114-251000-316-A11 ET0114 Homo sapiens cDNA.
Length = 278

Score = 28.2 bits (14), Expect = 6.4
Identities = 14/14 (100%)
Strand = Plus / Minus

Query: 1  acctccctccgcgg 14
          ||||||||||||||
Sbjct: 73 acctccctccgcgg 60

>gi|12129894|gb|BF800905.1|BF800905 PM1-CI0110-201000-003-f08 CI0110 Homo sapiens cDNA.
Length = 283

Score = 28.2 bits (14), Expect = 6.4
Identities = 14/14 (100%)
Strand = Plus / Plus

Query: 1   acctccctccgcgg 14
           ||||||||||||||
Sbjct: 211 acctccctccgcgg 224

>gi|12071436|gb|BF744760.1|BF744760 QV2-BT0635-311000-440-c11 BT0635 Homo sapiens cDNA.
Length = 534

Score = 28.2 bits (14), Expect = 6.4
Identities = 14/14 (100%)
Strand = Plus / Plus

Query: 2   cctccctccgcgga 15
           ||||||||||||||
Sbjct: 319 cctccctccgcgga 332

>gi|11770407|gb|BE965733.2|BE965733 601659792R1 NIH_MGC_70 Homo sapiens cDNA clone IMAGE:3896134 3'.
Length = 1336

Score = 28.2 bits (14), Expect = 6.4
Identities = 14/14 (100%)
Strand = Plus / Plus

Query: 1   acctccctccgcgg 14
           ||||||||||||||
Sbjct: 292 acctccctccgcgg 305

>gi|11766539|gb|BE963121.2|BE963121 601656923R1 NIH_MGC_67 Homo sapiens cDNA clone IMAGE:3865924 3'.
Length = 1442

Score = 28.2 bits (14), Expect = 6.4
Identities = 14/14 (100%)
Strand = Plus / Minus

Query: 3   ctccctccgcggag 16
           ||||||||||||||
Sbjct: 403 ctccctccgcggag 390

>gi|10348536|gb|BE890328.1|BE890328 601431783F1 NIH_MGC_72 Homo sapiens cDNA clone IMAGE:3916820 5'.
Length = 794

Score = 28.2 bits (14), Expect = 6.4
Identities = 14/14 (100%)
Strand = Plus / Plus

Query: 1   acctccctccgcgg 14
           ||||||||||||||
Sbjct: 115 acctccctccgcgg 128

>gi|10142985|gb|BE728993.1|BE728993 601562251F1 NIH_MGC_20 Homo sapiens cDNA clone IMAGE:3831924 5'.
Length = 840

Score = 28.2 bits (14), Expect = 6.4
Identities = 14/14 (100%)
Strand = Plus / Plus

Query: 1   acctccctccgcgg 14
           ||||||||||||||
Sbjct: 397 acctccctccgcgg 410

>gi|10095527|gb|BE707262.1|BE707262 PM1-HT0452-060700-008-e08 HT0452 Homo sapiens cDNA.
Length = 592

Score = 28.2 bits (14), Expect = 6.4
Identities = 14/14 (100%)
Strand = Plus / Minus

Query: 1   acctccctccgcgg 14
           ||||||||||||||
Sbjct: 343 acctccctccgcgg 330

>gi|9772196|gb|BE543551.1|BE543551 601070523F1 NIH_MGC_12 Homo sapiens cDNA clone IMAGE:3456940 5'.
Length = 1035

Score = 28.2 bits (14), Expect = 6.4
Identities = 14/14 (100%)
Strand = Plus / Plus

Query: 1   acctccctccgcgg 14
           ||||||||||||||
Sbjct: 332 acctccctccgcgg 345

>gi|9768571|gb|BE539926.1|BE539926 601060667F2 NIH_MGC_10 Homo sapiens cDNA clone IMAGE:3447161 5'.
Length = 902

Score = 28.2 bits (14), Expect = 6.4
Identities = 14/14 (100%)
Strand = Plus / Plus

Query: 1   acctccctccgcgg 14
           ||||||||||||||
Sbjct: 411 acctccctccgcgg 424

>gi|9342607|gb|BE397242.1|BE397242 601290754F1 NIH_MGC_8 Homo sapiens cDNA clone IMAGE:3621253 5'.
Length = 524

Score = 28.2 bits (14), Expect = 6.4
Identities = 14/14 (100%)
Strand = Plus / Plus

Query: 2   cctccctccgcgga 15
           ||||||||||||||
Sbjct: 228 cctccctccgcgga 241

>gi|9332870|gb|BE387505.1|BE387505 601274247F1 NIH_MGC_20 Homo sapiens cDNA clone IMAGE:3615538 5'.
Length = 637

Score = 28.2 bits (14), Expect = 6.4
Identities = 14/14 (100%)
Strand = Plus / Plus

Query: 1   acctccctccgcgg 14
           ||||||||||||||
Sbjct: 422 acctccctccgcgg 435

>gi|8140649|gb|AW950985.1|AW950985 EST363055 MAGE resequences, MAGA Homo sapiens cDNA.
Length = 638

Score = 28.2 bits (14), Expect = 6.4
Identities = 14/14 (100%)
Strand = Plus / Plus

Query: 2   cctccctccgcgga 15
           ||||||||||||||
Sbjct: 273 cctccctccgcgga 286

>gi|8139665|gb|AW950129.1|AW950129 EST362094 MAGE resequences, MAGA Homo sapiens cDNA.
Length = 611

Score = 28.2 bits (14), Expect = 6.4
Identities = 14/14 (100%)
Strand = Plus / Plus

Query: 2   cctccctccgcgga 15
           ||||||||||||||
Sbjct: 273 cctccctccgcgga 286

Database: GenBank Human EST entries
Posted date: Mar 29, 2002 2:35 AM
Number of letters in database: 2,114,234,064
Number of sequences in database: 4,280,058

Lambda   K       H
1.37     0.711   1.31

Gapped
Lambda   K       H
1.37     0.711   1.31

Matrix: blastn matrix: 1 -3
Gap Penalties: Existence: 5, Extension: 2
Number of Hits to DB: 5013
Number of Sequences: 4280058
Number of extensions: 5013
Number of successful extensions: 5013
Number of sequences better than 10.0: 61
length of query: 16
length of database: 2,114,234,064
effective HSP length: 15
effective length of query: 1
effective length of database: 2,050,033,194
effective search space: 2050033194
effective search space used: 2050033194
T: 0
A: 30
X1: 6 (11.9 bits)
X2: 15 (29.7 bits)
S1: 12 (24.3 bits)
S2: 14 (28.2 bits)



1: BM915385. AGENCOURT_6701642...[gi:19365764]

IDENTIFIERS

dbEST Id:       11598757
EST name:       AGENCOURT_6701642
GenBank Acc:    BM915385
GenBank gi:     19365764

CLONE INFO

Clone Id:       IMAGE:5481560 (5')
Plate:          LLCM2006 Row: d Column: 09



DNA type:       cDNA

PRIMERS

PolyA Tail:     Unknown

SEQUENCE

CCCG

CGGG

GAGT



GAGG 



CTGG 



TGGG 



CACG 



CTCG 



CCTG 



TCCG 



GCCT 



TCCC 



CGCCCTGGGCCATCTCCCTCCCAmimiCjGimGCAGCCA(^CAGCGAGG(}C 



GCCGGGGGCA«}(XK}GAC(XCCC(n€CGG(X;CACCCCCCCGGCTCTGAGCCGCCCG 



GCCGGCCTCGGCCCGGAGCGGAGGAAGGAGTCGCCGAGGAGCAGCCTGAGGCCCCA 



CTGAGACGAGCCGCCGCCGCCCCCGCCACTGCGGGGAGGAGGGGGAGGAGGAGCGG 



AGGGACGAGCTGGTCGGGAGMGAGGAAAAAMCTITTGAGACTTTTCCGTO 



GA(X:C(XJAGGCGC(X}GGACCTCTTGGC(X:GACGCT(JCCCCGCGAGGAGGCAGGACT 



GACCCCAGACC(£CTCCCTTTGCCGCCGGGGACG 



GCGTCC(^GGCaCCCCATTCCG(^CCAGCCCTC(X;GAG^ 



CAMGACTirTCACCATACCTCGGGCGCACCCTCTGCACGCGGCn^ 



TCTA(TOAGCCCCCGCGGAT(XmGACCCmra^ 



ACCTGCCGCAMTTCCCTATTCT(ftMCACCCCCGC™^ 



TTCGACGCTCCTrGCGCTGGGGMCTGMGAGCCCCCGGGTTCGTMCCTTTTCCT 



CGTTTTGAAAMCATCCCXXGTTMTAMCCTTO 
CCCT

tacggtttttggcgg(x;actamcamcatcgagtctcmggcggcggatgccact 

CAAG 

CCTGMTACTITTGCGCGTTAGGGGCGG 

ACCG 

GACCCTATTCATTGGTTTCCCCT 

CCCG 

ACACATTGTCATAAMCACCAOTTCGACAC^ 

CCTC 

CCCGCGTGTAAMTTTCCCGCG(XMTGCCCT^ 

GGGG 

TCGGCN 

Quality: High quality sequence stops at base: 467 

Entry Created: Mar 11 2002 
Last updated: Mar 12 2002 

COMMENTS

Tissue Procurement: DCTD/DTP
cDNA Library Preparation: Rubin Laboratory
cDNA Library Arrayed by: The I.M.A.G.E. Consortium (LLNL)
DNA Sequencing by: Agencourt Bioscience Corporation
Clone distribution: MGC clone distribution information can be found
through the I.M.A.G.E. Consortium/LLNL at: http://image.llnl.gov

LIBRARY

Lib Name:       NIH_MGC_41
Organism:       Homo sapiens
Organ:          skin
Tissue type:    amelanotic melanoma, cell line
Lab host:       DH10B (phage-resistant)
Vector:         pOTB7
R. Site 1:      XhoI
R. Site 2:      EcoRI
Description:    cDNA made by oligo-dT priming. Directionally cloned into
                EcoRI/XhoI sites using the following 5' adaptor:
                GGCACGAG(G). Library constructed by Ling Hong in the
                laboratory of Gerald M. Rubin (University of California,
                Berkeley) using ZAP-cDNA synthesis kit (Stratagene) and
                Superscript II RT (Life Technologies). Note: this is a
                NIH_MGC Library.

SUBMITTER

Name:           Robert Strausberg, Ph.D.
E-mail:         cgapbs-r@mail.nih.gov

CITATIONS

Title:          National Institutes of Health, Mammalian Gene Collection
                (MGC)
Authors:        NIH-MGC http://mgc.nci.nih.gov/
Year:           1999
Status:         Unpublished
Revised: October 24, 2001. 
Check on EST in GenBank:
Query= (1086 letters) 

Database: All GenBank+EMBL+DDBJ+PDB sequences (but no EST, STS, 
GSS, or phase 0, 1 or 2 HTGS sequences) 

1,205,903 sequences: 5,297,768,116 total letters 

                                                            Score     E
Sequences producing significant alignments:                 (bits)  Value

gi|10863872|ref|NM_000660.1| Homo sapiens transforming grow...  587  e-165
gi|18590091|ref|XM_085882.1| Homo sapiens similar to transf...  587  e-165
gi|11424057|ref|XM_008912.1| Homo sapiens transforming grow...  587  e-165
gi|7684381|gb|AC011462.4|AC011462 Homo sapiens chromosome 1...  587  e-165
gi|37097|emb|X05839.1|HSTGFBG1 Human transforming growth fa...  587  e-165
gi|37092|emb|X02812.1|HSTGFB1 Human mRNA for transforming g...  587  e-165
gi|340526|gb|J04431.1|HUMTGFB1PR Homo sapiens transforming ...  587  e-165
gi|12654682|gb|BC001180.1|BC001180 Homo sapiens, Similar to...  291  8e-76
gi|12652748|gb|BC000125.1|BC000125 Homo sapiens, Similar to...  291  8e-76
gi|18490115|gb|BC022242.1| Homo sapiens, clone MGC:22008 IM...  153  4e-34
gi|755044|gb|M23703.1|PIGTGFB1A Sus scrofa transforming gro...  129  6e-27
gi|7650477|gb|AF249327.1|AF249327 Rattus norvegicus TGF-bet...  66  8e-08
gi|4416081|gb|AF105069.1|AF105069 Rattus norvegicus transfo...  66  8e-08
gi|2394170|gb|AF015683.1|AF015683 Rattus norvegicus transfo...  66  8e-08
gi|6755774|ref|NM_011577.1| Mus musculus transforming growt...  64  3e-07
gi|1161133|gb|L42456.1|MUSTGF1G01 Mus musculus TGF-1 gene, ...  64  3e-07
gi|3688423|emb|AJ009862.1|MMU009862 Mus musculus mRNA for t...  64  3e-07
gi|201947|gb|M57902.1|MUSTGFB1 Mouse transforming growth fa...  64  3e-07
gi|18042365|gb|AC097483.3| Homo sapiens BAC clone RP11-146N...  44  0.30
gi|17481821|ref|XM_008785.3| Homo sapiens one cut domain, f...  44  0.30
gi|12737997|ref|XM_007116.2| Homo sapiens Zic family member...  44  0.30
gi|6005961|ref|NM_007129.1| Homo sapiens Zic family member ...  44  0.30
gi|11065969|gb|AF193855.1|AF193855 Homo sapiens zinc finger...  44  0.30
gi|4758847|ref|NM_004852.1| Homo sapiens one cut domain, fa...  44  0.30
gi|15787728|emb|AL355338.33|AL355338 Human DNA sequence fro...  44  0.30
gi|4028591|gb|AF104902.1|AF104902 Homo sapiens ZIC2 protein...  44  0.30
gi|1531593|gb|U50523.1|HSU50523 Human BRCA2 region, mRNA se...  44  0.30
gi|4468940|emb|Y18198.1|HSAY18198 Homo sapiens mRNA for ONE...  44  0.30
gi|19067958|gb|AY049805.1| Alopias pelagicus 5.8S ribosomal...  42  1.2
gi|18025465|gb|AY037858.1| Cercopithicine herpesvirus 15 st...  42  1.2
gi|12039248|gb|AC020659.5|AC020659 Homo sapiens chromosome ...  42  1.2
gi|19909461|gb|AC098709.3| Mus musculus clone RP23-1K14, co...  40  4.6
gi|19921137|ref|NM_135651.1| Drosophila melanogaster (CG47...  40  4.6
gi|18376846|gb|AC092198.2| Homo sapiens chromosome X clone ...  40  4.6
gi|18467841|ref|XM_078995.1| CG4751 (CG4751), mRNA  40  4.6
gi|18376869|gb|AC091898.2| Homo sapiens chromosome 5 clone ...  40  4.6
gi|18030132|gb|AC026695.5| Homo sapiens chromosome 5 clone ...  40  4.6
gi|15887302|gb|AC020914.8| Homo sapiens chromosome 19 clone...  40  4.6
gi|14578122|gb|AC092241.1|AC092241 Drosophila melanogaster, ...  40  4.6
gi|15292266|gb|AY051978.1| Drosophila melanogaster LD44770 ...  40  4.6
gi|15055218|gb|AC060226.39| Homo sapiens 12 BAC RP11-101P14...  40  4.6
gi|14389338|gb|AC084282.6|AC084282 Oryza sativa chromosome ...  40  4.6
gi|13677167|gb|AC015977.9|AC015977 Homo sapiens clone RP11-...  40  4.6
gi|9910225|ref|NM_020179.1| Homo sapiens FN5 protein (FN5)...  40  4.6
gi|10440613|gb|AC069145.5|AC069145 Oryza sativa chromosome ...  40  4.6
gi|10728714|gb|AE003631.2|AE003631 Drosophila melanogaster ...  40  4.6
gi|9246422|gb|AF197137.1|AF197137 Homo sapiens FN5 protein ...  40  4.6
gi|4190938|gb|AC000091.1|AC000091 Homo sapiens Chromosome 2...  40  4.6
gi|17431932|emb|AL646085.1|AL646085 Ralstonia solanacearum ...  40  4.6
gi|15073719|emb|AL591785.1|SME591785 Sinorhizobium meliloti...  40  4.6
gi|3628578|gb|AC005115.1|AC005115 Drosophila melanogaster D...  40  4.6
gi|3150432|gb|U50080.1|LSU50080 Lymnaea stagnalis serotonin...  40  4.6
gi|8052359|emb|AL356592.1|SC9H11 Streptomyces coelicolor co...  40  4.6
gi|6624640|emb|AL034344.24|HS118B18 Human DNA sequence from...  40  4.6
gi|15528721|dbj|AP003296.3| Oryza sativa (japonica cultivar...  40  4.6
gi|15289781|dbj|AP003141.2| Oryza sativa (japonica cultivar...  40  4.6
gi|6069643|dbj|AP000616.1| Oryza sativa (japonica cultivar-...  40  4.6
gi|960285|gb|L46862.1|RATLAMB2G Rattus norvegicus laminin B...  40  4.6
gi|198704|gb|J03749.1|MUSLAMB2B Mouse laminin B2 gene, exon...  40  4.6
gi|198702|gb|J02930.1|MUSLAMB2A Mouse laminin B2 chain mRNA...  40  4.6
gi|198694|gb|J03484.1|MUSLAM2B Mouse laminin B2 chain mRNA,...  40  4.6

Alignments

>gi|10863872|ref|NM_000660.1| Homo sapiens transforming growth factor, beta 1 (Camurati-Engelmann disease) (TGFB1), mRNA
Length = 2745

Score = 587 bits (296), Expect = e-165
Identities = 356/377 (94%), Gaps = 1/377 (0%)
Strand = Plus / Plus

Query: 246 cgagctggtcgggagaagaggnnnnnnncttttgagacttttccgttgccgctgggagcc 305
           |||||||||||||||||||||       ||||||||||||||||||||||||||||||||
Sbjct: 225 cgagctggtcgggagaagaggaaaaaaacttttgagacttttccgttgccgctgggagcc 284

Query: 306 ggaggcgcggggacctcttggcgcgacgctgccccgcgaggaggcaggacttggggaccc 365
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct: 285 ggaggcgcggggacctcttggcgcgacgctgccccgcgaggaggcaggacttggggaccc 344

Query: 366 cagaccgcctccctttgccgccggggacgcttgctccctccctgccccctacacggcgtc 425
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct: 345 cagaccgcctccctttgccgccggggacgcttgctccctccctgccccctacacggcgtc 404

Query: 426 cctcaggcgcccccattccggaccagccctcgggagtcgccgacccggcctctcgcaaag 485
           |||||||||||||||||||||||||||||||||||||||||||||||||||| |||||||
Sbjct: 405 cctcaggcgcccccattccggaccagccctcgggagtcgccgacccggcctcccgcaaag 464

Query: 486 acttttcaccatacctcgggcgcaccctctgcacgcggccttcatcaccggcctgtctac 545
           ||||||| ||| ||||||||||||||| |||||||| ||||||||| ||||||||||| |
Sbjct: 465 acttttccccagacctcgggcgcaccccctgcacgccgccttcatccccggcctgtctcc 524

Query: 546 tgagcccccgcggatgcctagaccctttctcctccgggagacggatccctctccgacctg 605
           |||||||||||| || ||||||||||||||||||| ||||||||||| ||||||||||||
Sbjct: 525 tgagcccccgcgcat-cctagaccctttctcctccaggagacggatctctctccgacctg 583

Query: 606 ccgcaaattccctattc 622
           || || || ||||||||
Sbjct: 584 ccacagatcccctattc 600
[0083]

Example 6: Statistical analysis of 5' end sequence tags

5' end sequence tags obtained from the same plurality of mRNAs in a
sample, or from nucleic acid fragments within the same cDNA library, can
be analyzed with standard software such as NCBI BLAST
(http://www.ncbi.nlm.nih.gov/BLAST/) to identify non-redundant sequence
tags as described in Example 5. Each non-redundant sequence tag can then
be counted individually, and its contribution to the total number of
tags obtained from the same sample can be determined. The contribution
of an individual tag to the total number of tags allows the
corresponding transcript to be quantified within the plurality of mRNAs
in the sample or the cDNA library. Results obtained in this way for
individual samples can be compared with similar data from other samples
to compare their expression patterns.
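
The counting step described in this Example can be sketched as follows. This is a minimal illustration only: the function name tag_frequencies and the second tag sequence are hypothetical, and tags are treated as identical only when they match exactly after case normalization, which is an assumed simplification of the BLAST-based redundancy check.

```python
from collections import Counter


def tag_frequencies(tags):
    """Count each distinct tag and return (tag, count, fraction) tuples.

    Assumption: two tags are the same transcript tag only if their
    sequences are identical after upper-casing.
    """
    normalized = [t.strip().upper() for t in tags if t.strip()]
    counts = Counter(normalized)
    total = sum(counts.values())
    # most_common() sorts by descending count
    return [(tag, n, n / total) for tag, n in counts.most_common()]


# Hypothetical tag reads from one sample
tags = [
    "ACCTCCCTCCGCGGAG",
    "acctccctccgcggag",
    "GGCACGAGGCTGGTGG",
    "ACCTCCCTCCGCGGAG",
]
for tag, n, fraction in tag_frequencies(tags):
    print(f"{tag}\t{n}\t{fraction:.2%}")
```

Running the same function on tag sets from two samples gives per-tag fractions that can be compared directly between the samples.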
[0084]

Example 7: Mapping of 5' end sequence tags to genomic sequence
information

5' end specific sequence tags obtained as described in this Example can
be used to identify transcribed regions within genomes for which partial
or entire sequences have been obtained. Such a search can be performed
using standard software such as NCBI BLAST
(http://www.ncbi.nlm.nih.gov/BLAST/) to align the 5' end specific
sequence tags to genomic sequences. In the case of large genomes such as
those of human, rat or mouse, it may be necessary to extend the initial
sequence information obtained from concatemers, for example by the
approach described in Example #. The use of extended sequences allows a
more precise identification of actively transcribed regions in the
genome.
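
As a rough illustration of the mapping step, the sketch below locates a tag on both strands of a genomic sequence by exact string matching. It is a simplified stand-in for a BLAST search, not a replacement for it; the function names and the toy genome string are hypothetical.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    table = str.maketrans("ACGTacgt", "TGCAtgca")
    return seq.translate(table)[::-1]


def map_tag(tag, genome):
    """Return (start, end, strand) hits, 1-based inclusive coordinates.

    Assumption: only perfect matches count. A tag that is its own
    reverse complement would be reported once per strand.
    """
    tag = tag.upper()
    genome = genome.upper()
    hits = []
    for strand, query in (("+", tag), ("-", reverse_complement(tag))):
        pos = genome.find(query)
        while pos != -1:
            hits.append((pos + 1, pos + len(query), strand))
            pos = genome.find(query, pos + 1)
    return hits


# Toy genome containing the tag once on each strand
genome = "GATTACA" + "ACCTCCCTCCGCGGAG" + "CCC" + "CTCCGCGGAGGGAGGT" + "TTG"
print(map_tag("acctccctccgcggag", genome))
```

A short tag that hits many positions in a large genome is the situation in which the extended sequences mentioned above become useful.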
[0085]

Example 8: Identification of transcriptional start sites

5' end specific sequence tags that have been mapped to genomic sequences
allow regulatory sequences to be identified. In a gene, the DNA upstream
of the 5' end of the transcribed region usually encompasses most of the
regulatory elements used in the control of gene expression. These
regulatory sequences can be further analyzed for function by searching
databases that hold information on transcription factor binding sites.
Publicly available databases on transcription factor binding sites and
tools for promoter analysis include:
Transcription Regulatory Region Database (TRRD) (http://wwwmgs.bionet.nsc.ru/mgs/dbases/trrd4/)
TRANSFAC (http://transfac.gbf.de/TRANSFAC/)
TFSEARCH (http://www.cbrc.jp/research/db/TFSEARCH.html)
PromoterInspector provided by Genomatix Software (http://www.genomatix.de/)
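
Once a tag has been mapped, the candidate regulatory region can be taken as the sequence immediately upstream of the tag's 5' end. The sketch below shows one way to extract it; the function name, the default window of 1000 bases and the toy genome are assumptions made for illustration, and the coordinates are 1-based inclusive.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    table = str.maketrans("ACGTacgt", "TGCAtgca")
    return seq.translate(table)[::-1]


def upstream_region(genome, tag_start, tag_end, strand, length=1000):
    """Return up to `length` bases upstream of a mapped tag.

    The result is given in the orientation of the transcript, so for a
    minus-strand tag the region following tag_end is reverse-complemented.
    """
    if strand == "+":
        begin = max(0, tag_start - 1 - length)
        return genome[begin:tag_start - 1]
    region = genome[tag_end:tag_end + length]
    return reverse_complement(region)


# Toy genome: 8 upstream bases, an 8-base tag, 8 downstream bases
genome = "GGGGAAAA" + "ACGTTGCA" + "CCCCTTTT"
print(upstream_region(genome, 9, 16, "+", length=4))
print(upstream_region(genome, 9, 16, "-", length=4))
```

The extracted sequence is what would be submitted to the binding-site databases listed above.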

[0086]

Example 9: Cloning of full-length cDNAs using information derived from
5' end sequence tags

Sequence information derived from the concatamers can be used to
synthesize specific primers for the cloning of full-length cDNAs. In
such an approach, the sequence derived from a given 5' end specific tag
can be used to design a forward primer, while the choice of the reverse
primer depends on the template DNA used in the amplification reaction.
Amplification by the polymerase chain reaction (PCR) can be performed
using a template derived from a plurality of RNA obtained from a
biological sample and an oligo-dT primer. In the first step, the
oligo-dT primer and a reverse transcriptase are used to synthesize a
cDNA pool. In the second step, a forward primer derived from a 5' end
specific tag and an oligo-dT primer are used to amplify a full-length
cDNA from the cDNA pool. Similarly, a specific full-length cDNA can be
amplified from an existing cDNA library using a forward primer derived
from a 5' end tag and a vector-nested reverse primer.
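
A forward primer taken from a tag can be given a first-pass check with the Wallace rule, Tm = 2 x (A+T) + 4 x (G+C) in degrees Celsius, which is only a rough estimate for short oligonucleotides. The sketch below is illustrative: the function names, the 20-base length cap and the 18-mer oligo-dT primer are assumptions, not values taken from this specification.

```python
def forward_primer(tag, max_len=20):
    """Use the first max_len bases of the 5' end tag as the forward primer."""
    return tag.strip().upper()[:max_len]


def wallace_tm(primer):
    """Estimate melting temperature in degrees C with the Wallace rule."""
    p = primer.upper()
    at = p.count("A") + p.count("T")
    gc = p.count("G") + p.count("C")
    return 2 * at + 4 * gc


def gc_fraction(primer):
    """Return the fraction of G and C bases in the primer."""
    p = primer.upper()
    return (p.count("G") + p.count("C")) / len(p)


tag = "acctccctccgcggag"   # 16-base query tag from the BLAST search above
oligo_dt = "T" * 18        # hypothetical oligo-dT reverse primer

fwd = forward_primer(tag)
print(fwd, wallace_tm(fwd), f"{gc_fraction(fwd):.2f}")
print(oligo_dt, wallace_tm(oligo_dt))
```

The large gap between the two estimates reflects the general difficulty of pairing a GC-rich 5' end primer with oligo-dT, which is one reason a vector-nested reverse primer may be preferred when a library is available.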
[0087]

Example 10: Alternative approaches for the cloning of 5' end tags from
cDNA libraries

A plurality of cDNAs can be amplified from an existing cDNA library
having a recognition site for a class IIs endonuclease at the 5' end of
the inserts. The PCR products derived from such a library would be
further treated as described in the examples herein.
[0 0 8 8] 

Example 11: Cloning of 5' ends by replacement of the Cap structure by an
oligonucleotide having a class IIs recognition site

A cDNA/RNA hybrid encompassing the 5' end of an initial transcript can
be obtained as described in Example 1. The Cap structure in such cDNA/RNA
hybrids is then enzymatically removed by a hydrolyzing enzyme such as the
T4 polynucleotide kinase or the tobacco acid pyrophosphatase. A single-
or double-stranded oligonucleotide having a class IIs recognition site is
then ligated by T4 RNA ligase to the RNA at the phosphate present at the
5' end of the de-capped mRNA. The ligated oligonucleotide will function
as a primer for the second-strand synthesis following the procedure given
in Example 1. By the use of a modified oligonucleotide in the ligation
step, the double-stranded cDNA can be attached to a support and used for
the cloning of concatamers as described herein.
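
The effect of placing a class IIs recognition site in the linker can be sketched in silico. The sketch below assumes the commonly cited top-strand cut distances (Gsu I: CTGGAG, 16 nt downstream; Mme I: TCCRAC, 20 nt downstream) and models only the top strand. The function name excise_tag and the example sequences are hypothetical.

```python
import re

# Recognition pattern and top-strand cut distance downstream of the site.
# These values are assumptions based on commonly cited enzyme data.
SITES = {
    "GsuI": ("CTGGAG", 16),
    "MmeI": ("TCC[AG]AC", 20),
}


def excise_tag(construct, linker_length, enzyme="GsuI"):
    """Return the cDNA-derived bases between the linker end and the cut site.

    The recognition site is searched for only inside the linker, so that
    internal sites in the cDNA are ignored in this simplified model.
    """
    pattern, reach = SITES[enzyme]
    sequence = construct.upper()
    match = re.search(pattern, sequence[:linker_length])
    if match is None:
        return None
    cut = match.end() + reach
    return sequence[linker_length:cut]


# Hypothetical linker ending in a Gsu I site, joined to a hypothetical cDNA 5' end.
linker = "AGATCTGGAG"
cdna = "ACGTACGTACGTACGTACGTTTTT"
print(excise_tag(linker + cdna, len(linker), "GsuI"))
```

If the linker carries extra bases between the recognition site and the cDNA junction, the recovered tag is shorter by that number of bases, which the slice above accounts for.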
[0 0 8 9] 

Example 12: Amplification step for a sample 

In cases where the amount of sample is limiting, the sample material
can be amplified by the following approach. In a first step, a plurality
of mRNAs is treated as described in Example 11 to replace the cap
structure with an appropriate oligonucleotide having a class IIs
recognition site. In a second step, the aforementioned template is
amplified by a PCR step using a primer complementary to the linker and a
poly-A primer. The PCR product can be used for the invention as described
in Example 1.
[0 0 9 0] 

Example 13: Utilization of extended 5'-end sequences

Initial 5' end sequences obtained from concatamers can be used to
synthesize sequencing primers to obtain extended sequence information on
the 5' end of a transcribed region.

[0 0 9 1] 
Example 14: Gene inactivation 

Sequence information obtained from 5' end specific sequence tags can be
used for the design of anti-sense probes, which could be applied in
knockdown studies.
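
A minimal sketch of the anti-sense design step: an anti-sense probe against a transcript is the reverse complement of the sense-strand tag. The function name antisense and the example tag are hypothetical; probe chemistry, length optimization and specificity checks are outside the scope of this illustration.

```python
# Translation table for complementing DNA bases, preserving case.
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def antisense(tag):
    """Return the reverse complement of a sense-strand DNA tag."""
    return tag.translate(COMPLEMENT)[::-1]


# Hypothetical sense-strand tag (an assumption, not a sequence from the text).
print(antisense("ATGGCC"))
```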

[0 0 9 2] 

By the present invention, novel means are provided by which not only
information on the nucleotide sequences of mRNAs contained in a sample
may be obtained, but also novel genes may be cloned. By the method of the
present invention, information on the nucleotide sequences of the 5' end
regions of a plurality of nucleic acids, namely mRNAs and cDNAs, in the
sample can be obtained effectively. Since the information on the
nucleotide sequences of the 5' end regions is obtained, unknown genes can
be cloned after the identification of a novel transcript. Further, it may
be possible to attain mapping of transcription start sites, mapping of
promoter usage patterns, analysis of SNPs in promoters, creation of gene
networks by combining the expression analysis, alternative promoter usage
and the other data in this disclosure, and selective recovery of promoter
regions in fragmented genomic DNA.
[0 0 9 3] 

In particular, the invention has a great impact on the identification,
cloning and further analysis of promoter regions. After sequencing
concatamer libraries holding information on a plurality of 5' ends, a
statistical analysis of the distribution of the transcriptional start
sites will be possible. Changes between different physiological
conditions switch the mRNA transcription machinery into a new "status".
Such a "transcriptional status" can be measured by computing (1) the
presence of the transcription starting points and (2) the digital
expression of the various transcription factors, obtained by counting
their tags, and by correlating the presence of starting points with the
transcription factors. More information on gene networks will be obtained
by comparing the perturbation of gene expression between two different
conditions. Such comparisons of transcriptional conditions between
various diseased and normal tissues could allow for the design of new and
very comprehensive diagnostic tools. Thus the invention will be of high
commercial value in gene discovery and gene analysis, and it is
envisioned that the invention will be of use in the development of novel
diagnostic and therapeutic products.

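The "digital expression" obtained by counting tags can be sketched as follows. This is an illustrative assumption about how tag counts from two conditions might be normalized and compared, not a procedure specified in this disclosure. The function names and the example tags are hypothetical.

```python
from collections import Counter


def count_tags(tags):
    """Count how often each tag sequence occurs in a sequenced library."""
    return Counter(tag.upper() for tag in tags)


def tags_per_million(counts):
    """Normalize raw counts to tags per million sequenced tags."""
    total = sum(counts.values())
    if total == 0:
        return {}
    return {tag: n * 1_000_000 / total for tag, n in counts.items()}


def compare_conditions(counts_a, counts_b):
    """Pair the normalized abundance of every tag seen in either condition."""
    tpm_a = tags_per_million(counts_a)
    tpm_b = tags_per_million(counts_b)
    return {
        tag: (tpm_a.get(tag, 0.0), tpm_b.get(tag, 0.0))
        for tag in sorted(set(tpm_a) | set(tpm_b))
    }


# Hypothetical tag lists from two conditions (assumptions for illustration).
normal = count_tags(["ACGT", "ACGT", "TTGA", "ccat"])
disease = count_tags(["ACGT", "TTGA", "TTGA", "TTGA"])
for tag, (a, b) in compare_conditions(normal, disease).items():
    print(tag, a, b)
```

Normalizing to tags per million keeps libraries of different sequencing depth comparable; a statistical test would still be needed before calling any difference significant.
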
[0 0 9 4]
4 Brief Description of Drawings

[Fig. 1]

An example of the preparation of a plurality of first-strand cDNAs is
presented, where the starting material can be RNA derived from a
biological sample or a cDNA library.
[Fig. 2]

An example of the cloning of 5'-end specific tags into concatemers is
presented. The example includes, but is not limited to, the use of the
restriction enzymes Gsu I, Bgl II and Eco RI.
[Fig. 3]

An example of a first linker to be used for the cloning of 5'-end
specific tags is presented. The example includes, but is not limited to,
the use of the restriction enzymes Bgl II, Gsu I, and Mme I.
[Fig. 4]

An example of a second linker to be used for the cloning of 5'-end
specific di-tags is presented. The example includes, but is not limited
to, the use of the restriction enzymes Bgl II, Gsu I, Mme I and Eco RI.
[Fig. 5]

An example of the structure of a di-tag is presented. The example
includes, but is not limited to, the use of the restriction enzymes
Bgl II, Gsu I, Mme I and Eco RI.
[Fig. 6]
An example of the use of a 5'-end specific linker is presented, in
which the linker is used for the enrichment of individual nucleic acids
and their sequencing.

Fig.6: Serial Sequencing (panel labels: Binding to Support; High
throughput serial sequ...)

1 Abstract

A method is disclosed to obtain the 5' ends of transcribed regions
from a plurality of nucleic acid fragments obtained from biological
materials or synthetic pools. DNA fragments encoding the 5' ends are
enriched for their individual analysis or for the analysis of concatamers
thereof. The sequence information derived from 5' ends can be used for
characterization and cloning of the transcriptome.

2 Representative Drawing: None
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lm%m-%] 100088546 

03(3238)9182 

# k mm l tzmm * & < miE ta b *m ^mm t tz 
w-mmmmwvmmx i 
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So 

[»*£4l mRNA© 5 ' ^fRit^^^^^^tf^O c DN 

A^fcftSStifc* ^>fcf hn-e^^ttfcRNA. X&1W*<7>? ^5 
V-*«FHfc LTfflv^r^JSR^^bDNASrS^WtclHlJR-tSlllXSgfc^ 111 
JRSfrfccDNA*©, BUfBmR N AO 5 ' 5fc#«#MB8rW&«i*£4>fc < 
^^trcDNA«**^tr»f>T-«:a^«Jtc|5lJR^||!p2Xei:, 0JRStL^Wf>i 
*lv^:MLtlM^Mt^f 3 Xg t £#tf\ SfcH-*©te¥KMIfc-*-& 

[m^m 6 ] flrSB&lXfite, mRNA^lCUi-IcDNA^ 
a-T^X^h, mRNAW^t^MCI^^tt!|I^$^Igt 

-**rna*w*- axe*:, a«#ys^tt'»Kfca^wtcj»^-t-*, 

im^m7) ^RcDNAS:fin*lIgl^ RNaseCUM 
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[Hf**i 0] ifrfB^^^^v^v^-^-e^^, flrEJftftl 

[si** 1 1 ] m$m^ i &m%\±, ^m^±^m^^itn^mm^ 

li7fy^X e-X-e* * fflf ** 4tfzi*9 %BM<Vjf&o 

[1***1 2] BfrfSfH2I5gfi> ^lXgt?|5IJR$*L7tcDNAO, Tt&fB 

xm~m c DNA^t-siti:, # ^ fitz y -m&zi&m cdna^i 

US** 13] ATE V ^ * - Kf*> »^^*MW**frfr 3 JtTiS 19 , V 
®t^^$^^>X5gt, ^##*|iJJ|X1-^>X^It : l:#tflt^4^v^Ll 2 

[ft** 1 4 ] irEa^ysf&tt^K^if o , i&^v&mm& 
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W*|17] UfrSafWRB^i, Gsu I, Mme I, Bpm IX 
ttBsg I^J:^^7^I IiWKa5*"T?*&iS*^4*v^Ll 6«5V*i*tL^ 
l£fcflB*W>2rt*o 

[»*« 1 9 ] i^ti8 ov>-rtt^ i Tg.KtmviimK 

s mmmix.mzftk.x, ^&&btix^&^mcDNA*mmtt®-tLxm 
mm-h z.t&x£&wniz£ 19 mmm&fm**^ 5 ■ rna 

mmtzz.ttfx%z>®ni l z2: *)mm%m%wimz j t<o5' m&i&mzGirzm 
m 9 yfr^m&yu & wtft oc.DNA7^^7'j-^f)ist§, mm. 9 

<DM&lzm^t>*l2>, mRNAO 5 ' 5 * &W&WL9 ? <omk% 

3] if*m&v>L2 2<D^-r^i^zum.^mz^ 
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[If ^2 4 J 81^2 3<DM#:^tf/<^^- 0 

2 5 ] 1 t 2 2 m*1**L#» 1 &K%5&.<Qjj&K £*)M 

im^tm 2 6 ] ftzmmommm, l tz^ x$m* 

im^m 2 7 ] m^m 1^122 <d^-t 1 ^mm^mtm 

St*$)ot, f$fW<^i&tf>m R N AXfi c D N A i:ov^^if- * *^fif 
[If ^ 2 9 ] fff$£ l^L2 2 <D\,*-?tifr 1 ^ICiBi^o^tc J: 19 p 

3 0 ] 1^122 o v^-r tL^ 1 ^ m smo2r& t ibj 

[If ^3 1] fff$£l ^v^L2 2 0v^^i^jcfe«<7)^i:|lD^ 

im^m 3 2 ] m^m 1^122 ov^rtt^ 1 m^mm^m- zvm 

Mt$£ 3 3 ] 1^12 2 <D^-ftifr 1 ^tCfBmcO^^ J; *) M 

[nasi 3 4 ] it ^ 1^122 o^-ftifr 1 ^^ismo^tc i m 

o 
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3 6 ] 1^12 2 ov»-r l ^tcfB«o^^ i *) m 

So 

3 9 ] — -frm cDNAt HufB^y ^v- OZL^il^jjfc * 'J rf * 

[0001] 
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[0 0 0 2] 

^ift^lR 13 43 It & $ tt J5 ^ § -C* * o 

[0 0 0 3] 

t£*> ^^aa^O^^^^tL^mRNAom^SEyiJO^ti, mRN 
A SrUM K LfcaME¥**UJfl LT c DNA9 W ~f*7 V — *ft« U IcDNA5 
>f^7V-WAc DNABf>T-*-etL-€»*LP^ii:lcj: DfrfrtiTv^fco t£ 

o m&*tm*(OmRl$ AOM^^ ->*M£«U 4Sv>K8IS?3£-C 

[0 0 0 4] 

^$^0 7 7^ V — fl£im::v*;b*9>£DNA^>f ^nTW (J 

o r d a n B. , DNA Microarra yslGene Expres 
sion Applications, Springer— Verlag, Be 
rlin Heidelberg New York, 2001 : Schena 

A, DNA Microarrays, A Practical Appro 
ach, Oxford University Press, Oxford 1 
9 9 9) Zm^Xfft>tlX\<*2> 0 Z.fLb<DmMi(7)tztbK^ m* <Otik&=f'X\*W& 
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[0 0 0 5] 

mRNAoaaiaE3?9fll#*«JSpWtc»*^fei:LT, VISAGE (S e 
rial Analysis of Gene Expression) feffj® 
hfiX^h (Velculescu V. E. et al. , Science 
27 0, 484-487 (1 9 9 5), -O^&ii, »mRNA<03' }£3g 

w*<owmnmm&&triam.<o&\i* aobpujg) ona^^^ lt^i 

ft#DNA*fl|jftU iOMfDN A^)tlE?iJ^^tl) i f) , Z(D 

■ftmiZ£tU$^ »mRNA©3' ^mottHfi^fff £ 
fcri*-r*§&o S AGE&HitUf, 3' *^lifi#<7)Mv^^m*@e^JL^^ 
•C****, mRNA^gfc&OiJr^ 1 0 b pS£©Sv^«oigi|^lJ^^o 

[0 0 0 6] 

L J -5 f*RS] 
SAGEffitU mRNA<0 3 ' ^W&<Dl&mm*%\& CI fc**t?^ 

fflJt|-e**o SAGE«fflv^tLTV^tc^>^*?c>T. SAGEiicDNA^ 
5' ^flfCcDNA^n-^tHS^Sitfc/TLft^o HR^ 4bp<^)^7 
* I I sWRaaf^fflv^tLTv^o 4bp*^-li, ¥^LT. 2 0 

0-3 0 0co^^ l^a-^KfclBSU i*LlimRNA<6H©W»4t'fXo 
f^LTl/lO^^o ^OJ: o KS AGEiEati. 3' «}:ov^fi£< 

fcj&*-c#&»t*L£% % te^A^oijg^os' 5fc»teov>-r»±fiNft£2ll 

ie>*it3&»T?i*v^fc*Sft<^iggLTV^ 0 ££>tc. 10bp©^^t 

l&f-tfi-Z&Zo Lfcri*oT, 10bp<0^7tt> mRNAO— g&Sr^tf Tsa 
ge-tagj <omfe\Z<Ofrm^bflX^k 0 ffifLtfrtWmR'N Af3\ HtfLHl^ 



mtE# 2003-30727 
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<0^yAO(g^$tt^^O3~5%%#tptCdiri-\ #H«J Ts age-t a 

5%<7>-gB£^-if 0 4 b pffiMi£iff±, ^4 4 bp (2 5 6 bp) 4fC5^^A 
fcK^J&3JWr*-*OT\ Tsage-tagJ te, $V *l£ 3 ~ 5 % 

«^l/2 5 6 ^Wt*^ (tm=0. 0 2%*i) 0 Lt^oT 

[0 0 0 7] 
[0 0 0 8] 

t*tbZ> 0 S AGE^H^S' *S^iv^f:fc|;^n^-^- 

^n-->^-t-^tg^^^tcffiv^cii:jij:i9 > ***** 

[0 0 0 9] 

[«W*J!lftt*fc*©^«] 
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*m§twmm*. &%mftoi&^ mRNA© 5 ' jm&&<D&&tmm&* 
irtrvmo&mmK *m$m\mu"t & ztizz 0 mR nauseam* 

[0010] 

A-e&oT. imRNA«5' ^flit^M^J^fl^^^tf c DN AZMlRm 
Km^fhmiJMt, HJJR^ti-fccDNA^^. Ttu f Bm R N A <£> 5 ' 
C**tW*«*fc«*IWft**«r^«: < fc k^-fr c DN AHJtSr^tfKffi-Sril^ 

3 X*l ^«OmRNA0 5' 5jcSI«i*Offi*BB^Jfllfll*'g-tr 

RNA0 5' ^«*©^E^JOJ!H^ffi*Jittr«o Sfcfc, 19 

9§tt$fcfc> ttE£tt#*&iH|^ft&E?0*> m^ORNAf^40ft#fe£^ 
[0 0 11] 
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IC, v - y ^ y 7%Jfc%ft%> ft *b KM ^htlZo 
[0 0 12] 

[0 0 13] 

[**91«>IEit«>2*ft] 

o 

[0 0 14] 

S1XS 

^nuR-t^xs-e^^o 

[0 0 15] 

DNA9-T ttlSo 
[0 0 16] 

c DNA ifM"^^^^ tlX& «9 , £ tL h <D&%KD-%&<D^-ftl £ i> 
t4if:#tl4off*U^^U^ *^y/>9y/<- (cap tra 
pper) S (iHPiero CARNINCI et a 1 . , ME TH 
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ODS IN ENZ YMO LOGY, VOL. 3 0 3, pp. 19-44, 1 

h^-e§^, 0 looft#& (Pelletierf)> 1 9 9 5£SB«$*l* J: ^ 
[0 0 17] 

^ft'/^Cftit, mRNA<7)5' BAP (ilT;l/A'J 7*^ 

77^- if) o^a?**?^-^?^; >SMbu ^^^77°^ 

mRNA<£> 5 ' *^;:RNAV ST— VKX Z t&X»£ 2> (Maruy 

ama^S u g a n o) 0 ZOX^KLX, 0Jx.wr, 5^-y3>IgOR| 
K> ^V^^l^^-^K/V^^V^^KE^JJi^ cDNAX(iRNA©5 
' 5fc»fcx ^7^1 I sS«$t*i<ii:^l4o ci^^^^i 1 sfifllM* 
fgfi, cDNA*PIIU5' *I^/^mci^fl5o 
[0 0 18] 

et^>JC>ftx.T^^^ y^tt^^/^® (Pelletier et a 
1. ,Mol. Cell. Biol. 1995 1 5:3 3 6 3-7 1)^ * 

[0 0 19] 

G e n s e t KX QmmZiXtzXi * * y 7'#jt K * U =f* * 
^^KfcW^Wfctt^fcifcfc-CS* (*@#^6, 0 2 2, 7 1 5) o d 

*t«\ (i) I I sSIStgwaSr-^if*'; =f*^ v-^-f-K^^^^^c^ 
(2) H-m^> ^-c% itt^^ziil^i'-^omx.^- 1 ; m 

cDNA^^7^IIsif«U5' *^£fERU ^v>-e 
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[0 0 2 0] 

5, 9 6 2, 2 7 2) &JflV*S £ & T§ ^ Q #Bfe£B«1-* i £^T~§ 

T\ lie DNA^iU ^?§f,|:i^PCR (®IxJ£, SMART ( 
[0 0 2 1] 

lto*#M-cte> RNA©p n pJt!Wta#i:ii> 7? 4 5 J: ») c DNA 
^ (Carninc i et al.Biotechniques, 2002) 

[0 0 2 2] 

T^^-fi. t'^'dT77^-7-tUv^U ilRNA^mRNA©t^ 

^¥»*^©mk$tLTM<e^<o^*»teo±t-*o-c»t lv^o s/s, dCT 

2Lv> 0 £fz, §-IcDNA^t> C TAB (-bf ;VfV^f;V7>^7 
A7n= K) *ro#»r l?, XiicDNA^ttm?-* i i?-jt3:ift&:fr 
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[0 0 2 3] 

fll3£l±, mRNA© 5 ' ttKlf&t zmfcZlb 1) , giRNA ( t R N A) * 
y#y-ARNA ( r RNA) fcli^ftLfcvv, ^T, ffi#&#fl-t It^R N 
A«:fflv>fc^^ S^^tt%K**«f^$tL*OtimRNAO*-C**o £ 
tz, mRNAT^oti 5' 5fc»«?** y ^ffi&tfWM £ flX L £ oTV^ 
O *±««y*^*& Jt t± ft-a- $*i&v> 0 mRNA^f; T^ljS ^ * 7 > 

4MD©2FjS£ i Off -5 d^^-Ci^o Milt mRNA£Na 

i o 4 o x i %Miffl\x~$8Mir2> ztKz 7m7&±<Dv*—frm&mk 

^^t^it^t^o &&</*ii, ^ftcDNAcO^^HPcO^H^^P, 
[0 0 2 4] 

RNase I £ J: !K — ^R N A ZWWilr&o $>2>^lt, — 

*«RNAtftftfS4)5 , cDNA/RNA^'frv y KSriDWCS 4v»|&OR 
Nase, Xf«^0— ^iIRNA*®^0#Mtt-C-SJIIfi-^i t^T'^^RN 
ase<0SWMv»4i^|4 o i-jKcDNA*»RNA©5' 5jd»^W 
EtSf ^-e#I$^^o^RNA/c DNA^^f y Kti„ RN AO 5 

(D—^moUftr-WmztiXLt^ **y7m&tf£:t>ti2>o lot, 
«£J:>K **y7l*&&jJi#LTV>£0«\ mRNA<7)5' 5|CS»*t?^(C# 
SUmRNA/cDNA^^fyj y K<7)«^£&£ 0 
[0 0 2 5] 
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9r^^^tf-X, 7f-; **«telf-X, T^fn-^tf-X, *°'J.x^V>H" 
-X, -b^rn-^e-XXtiT^^V^i^^te^fl-^^ #?L1£2f9*e- 

[0 0 2 6] 

^ty/iMfnmRNA/cDNAA^yj y F£3£##-hCTlMfrt & 
o ^^^tttr-X^^^fi, ^^^ttT^tt^-X^S^t-HlilXl-^i 
fc^^^>o -LfB<7>M*K Z.<D&mX~*W?Mi&$:mi>X^2><Dl l ^ mRNA 
<7)5' *S*t*^:#Stf:mRNA/cDNAA^riJ y Y<DfrX$>2>frb 
, ^tLlCj; f)mRNA05' ^^l^Mfitl^^^^tf c DN A^iRfitJ^ 

^<^^^f dm^^^^^JiODNA^V- t RNAtMLtrn 

RNAXttt V =0< ^ l^f- K<7) J; 9 ; vlLYtT^ 
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[0 0 2 7] 

Lfcj&s, mRNAO 5 ' ^*ttfcffl*|f»fc***'&t> c DN AfcjWRttfcEHX. 
[0 0 2 8] 

«<»2Ig-Ctt> mRN AO 5 ' *ffl«#K:|BffiWfcft#fcffiffiWfc^#£ 
< t *>#tr c DNA«*«r#trllf^«:»R»^iaJR-r*o 
[0 0 2 9] 

*Ui, t^#^«x.lJN a OHO i ^ ^7J^ V f felt & Cl ti:i Off "5 
**-e§&o **V*I4, 7;V#V Mtx.T> RN a s e H (DNACA>fyiJ^ 

JMlCfclK mRNA/cDNA^^/'J 7 KHSW^lIU ZbK, m 
RNAii^$^LT^-0c DNAfmtm&o 
[0 0 3 0] 

[0 0 3 1] 

^-«P^fc*^&< ti>%1r2>V -hifimRNAOS' 
*>cDNA<7>3' \zHJfot2>> iOi-icDNAO*»l:Iin o f 

T?5* 3fe*E^J*^*£*&#fc*n-r:y^a;fcae>fc:, mZ-tf??XI I siW 

k^a-t a - tarn* L^&mmxit^ 0 

[0 0 3 2] 

^1->5>'; >*-^ffiv>^<7) < t ^ (SSL LM (single stra 

nd linker ligation method, Y. Shibata 
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et a 1 . , BioTechniques, Vol. 30, No. 6, pp. 
1250-1254, (2 0 0 1) ) £J 0 fi 1 ? £ fc & o I I sM 

eb^h t w\ mmmmmmuii^o^r-mm^m § & MRm^i 0 1 

M£ LTG s u I *^tf e>*L*^*dtHc|S5fe$^a feo-cii^v^o G s u 1 "Cgfe 

si-& —*<Dmm$m$L<o 1 6 b prat, ^oma^fgwio 1 4 b p 

£# ? cDNAc94»&C3fc&«i;9 K % bt)t>l^ cDNAOJfc«aa95' fifl 
ofcmRNA$>3' 48) OF*3«HC$T3fc£ J: ? £^ ±159 V 

[0 0 3 3] 

±IE»— lie DNA£. ZOXdlkV V>*-<D7> 

fc— & c DN A ZMM£ LtfrlcDNA^t^o COIigftifffil: 
i Off d 

[0 0 3 4] 

cDNA«5' (»rfflcDNA©5' * 

i summm£ ugsui Mm$mM&*:±&?>rj**v =*v-sb 

JWi, £2«cDNA<*>5' 3311100* (tmraRNA<D5' fll^ldMtt) 
^m^l6bp (fcrtfUBttflUil 4 b p) #tr 0 Mmel^ 



ffiSE#2 003-3072722 



€2 0 0 2-2 3 5 2 9 4 ^ ^- V '. 17/ 



o 

[0 0 3 5] 

-IcDNA*^ 1uiamRNA05' < t 

tf c D N A^it £^«if W £ HI 4Xf & ^ 2 1*1^^ 1" & o 
[0 0 3 6] 

^2X^^SSLLMHiJ5fi : d^Hov^rmBJt^\ f2Itli 
itttcm^^tL^>^co^{i^<, ^-ilcDNA03' 33Mtt (Mt^o^: 
mRNA© 5 ' fc^trltftf* III JR-C§ fc^&TfcfUr, <tfeco 

#i*fclfcJH*r& 5' —3' fflflPStifcjSJK-e* 

K<£ *K l-icDNA^S' ■MMWic, (§Ii^ofemRNA(05' 5fc»^*fc 
[0 0 3 7] 

*OmRNAtt*Sk^fiEU i£> V >#-Ji±fEO J: d V rfv- 

^ItLfxO^OifM-^m^L, (3>*tH 4r^i"&o c DNAIf 
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JIM* * n > 9-* %> i )Wi Lv^i&T L & ^H-etl* v* G 
[0 0 3 8] 

NA©5' *M^^ie^J^^n^>«r i^fiSo G s u IXiiMme I *M 
v^jift Lv>^#-e(i, #mRNA(05' 5fc3MWJW>**: 1 6mSXii2 OfS 
«5MSS^J^-5ui;^t^l. 0 1 6m*Xfi2 Oi^Sabttif, j&tf-3fe»tCmR 
NASrlit?]ii^lH#^-e§, tfc, -#r«m R N A £ & o % 

ffrfl^ft&ffim^^tUf, m^#^^tt^±|eifJt<7)ifc (£F£L<ii2 0 
-3 0fl) WmRNAO 5 ' J&#ffi&<OW&gM*&i& - 

^•*f £ £ sMtt * £ § £ o 
[0 0 3 9] 

5' 5|t*O^B^^^«?mRNA##<ELTV»fc»#, -TOU^S: 7 * 7 
- Vmoy^J^-k U V^-^fflOti^V ^dT^-/7^v-hLTRT-P 
CR£?r 9 dfctcJ: 19. IflmRNAS^cDNAHI^^t^l^o 
v*{*, NASBAftfCi l)mRNAS:iilB"t*^i:<>^ri6"^**o I^t, ^ 

y5'J-HOcD N ABffrO— IfBXtt^ft SriiJfii" * #> Kffi ^ * - ^ § 
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[0 0 4 0] 

±|B^fe*Ctt, ^OmRNAX^RNA^ajMH-t Ltffiv^^ ¥ 

nm-?&&o Z<D?jm~£tl&, g^S c D N A 7 ^ 7*7 !J - J tl^M<7) 
c DNAO 5 ' 33Mtt ("T icDNAOilC^o/;mRNA0 5' 

[0 0 4 1] 
[0 0 4 2] 

^-«*0«R»iaJR (mR N A<£> 5 ' *5|§«c |WJ CffiafeK^iJfc^aBffi- ( 



tbSE# 2003-3072722 



m 2002-235294 W v I 20/ 



[0 0 4 3] 

•t&^tttf-xnjg^u s sLMfcwfliicaasu ijudnas- 

7fcffl#ff6, 352, 8 2 8 ; 6, 306, 5 9 7 ; 6, 280, 9 3 5 ; 6, 
2 6 5, 1 6 3; 5, 6 9 5, 9 3 4) »:iot*$*l*OkRti:i|fftW*>- 

[0 0 4 4] 

^J£fi\ *y =/**l/*^KJ± % cDNA60 5' %.1$K%k&ir2> 

Kon- K«Bfc£#tr 0 c DNAtM^'J NXL^^'J 

. x*yja* vr— tfV I I ^aaHi-iit^SSo T^ScJ 

W?- K«rfflv»TK?!I«ra^* t fc*»^§*o tf-XJh<£> c DN 

[0 0 4 5] 

[0 0 4 6] 
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cular Cloning (Sambrook and Russel, 20 
0 1) <D—®imm&V>*?i'a >^fSm$ttTV^ 0 iCCIEifcSti&^-COTU 

[0 0 4 7] 

»"bO^RNA<OlS 

a) ?#M> : 4M ^y--)Af tv7^- K 2 5mM^x>it h 

(pH7. 0), lOOmM 2 7 s h^-* / -^RTfO . 5%n 

b) R N a s e7'J-CTAB/IIII: 1%CTAB (S i gma) „ 4 
Mmm, 50 mM Tris-HCl (pH7. 0), ImM EDTA (pH 
8. 0) 

c) Molecular Cloning (Sambrook and Ru 
ssel, 2 0 0 1) KSBIfc $ ft 7*. 

Molecular Cloning (Sambrook and Russe 

i, 2 o o i) ^mm.^fitzxd^v>mm.mm. (pbs) 0 

5M fiftth'J^ 

7M ttft^r-'^A 

RN a s e7V-IfI'f*>* 

[0 0 4 8] 
^RNAOpM^fciexoyn h rr-;w 
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5 0ml77;^> (falcon) ■?*--7i¥X*®.me>i&%*Mfc1r2>o SJi 
Oiti. #?3; L< ii> 2 0m 1 om&DMtzt) 0. 5~lgf*lio 
2ml02Miith'J')A (pH4. 0) fcillx., 1 6 m 1 <Ofc¥ffiiky 
-^fcimil&o W'^^^t-'Ci^tSe 4m 1 0*nn*;WAfcflnx. 

4lC"e, 6 0 0 0 r pmf3 O53*ra»'fc-t&o 

h (2 5ml) TWc^fa-ycfto I5l20ml Oyfcfg^fc 

4W7 5 0 0 rpmt'15^3tWo RNAii. St^C J; *) ttMt&o 
SCN&LWfB&'t&tilbKy ifcJR * 7 0%^*y-;V^2li]T5fc?#--r 

&o #^Of^^(i7 5 0 0 r pm-C2:53^jt'[yr&o 

CTABt#Il*i*f^ 0 f^*^ RNA&4m 1 (DfcK^K-ftmm L 
tz&s 1. 3ml<7)5M NaCl^, 1 6m 1 <7)CTAB/^fit?:i[] 
x. 0 RNA£^#jK£tIB: 

7500rpm (9500xg) "C 1 5 75*H3jif 'l> U 7jc*B£3t-C&o 
RNA^l/-;h^ 4ml07M GuC 1 tffiSt^o 
#«L*:RNA£, 8 m 1 <Dx.? S - )l>Zl)\\jL2> Z bK£ t)itm.2 J £Z>o -2 
0ttl~2W>^a^-m 4t-C7 5 0 0 rpmtl 5frmM>kir& 
o H5ml<07 0 % J -/l'-eitR 

7 5 0 0 r pmt*5m ffi£at'k*f- & <> 

RNA ^5 0 0-1 0 0 0/i lWRNas e 7 V — Oj^^T * >Pl^7)C (d dH 2 
0) t:#iiStSo 
[0 0 4 9] 
mRNAOll 

TfjJtO^fyK MilfMACS mRNAfi^^h (Mi 1 tenyttl) 
o 1 yA-qu ick (Stratagen ettSD ^Zm^ZZtlZ*. 1) 



mm^2 003-3072722 



m 2002-235294 W v : 23/ 



V>) ftSfeOcDNAtt* ^^ft c D N A OljRIg-eBf^^ tii o po 1 y-A 
[0 0 5 0] 

(f, T3, T 7X&S P 6 RNA^'J^ 9— tffcfflv**:^ >fcf h nofE2£RiSfc 

tiZo mmmmte, -b^^RNAoKf^imi^cM^^o 

-pFLC III (Carninci P., Shibata Y., Hayatsu N., Itoh M., Shiraki T.,
Hirozane T., Watahiki A., Shibata K., Konno H., Muramatsu M.,
Hayashizaki Y., Balanced-size and long-size cloning of full-length,
cap-trapped cDNAs into vectors of the novel lambda-FLC family allows
enhanced gene discovery rate and functional analysis, Genomics, 2001
Sep; 77(1-2):
7 9-9 0) tip^hih&vJ ~f*7 V -<Dm^, >^ (horn 

i n g) x>Kjsi^vr-- t£I-Ceu IZiiPI-SCe I Ti2J^s & o 

o ffiit<Dtztt>K, ^^-f^X\ 

7"9^ ^ KDN A lOOv^f^DG 

10xi«$ 40vOnL 

nmmrn ioou 
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ddH 2 0 ad 40 0^4 i?uh 

0. 5M EDTA 8-7^^DL 
10% SD S 8v^f^nL 
T'nf'ft-^K (lOmg/ml) 5"70nL 
t#H-£5 0 0W ^nL<7)-7^y-;v/^nn^;VA-eJfim-r^HU^, 4 
1 SSHBW^.x^- hi-^o 7M££5 0 0^^nL«^nD)j!We2SS 

[0 0 5 1] 

KDNA 20v^^nG 

5x T7XiJT3i«$ 2 0 0^^nL 

0. 1M DTT lOOv-T^nL 

2mg/ml BSA 40vOoL 

lOmM rNTPs 507^^nL 

T7XIJT3 RNA*'^7^ lOv^nL 

ddH20 adl 0 0 07^f^DL 
3 7t:-e3~4B#^, ^<Di><D^miJUir^mi'^ >3r^-<- hf&o 

10mMM*;Vy^A 10v-f^nL 

lUA^^nL DNase RQ1 5 

0. 5M EDTA 1 OvODL 

10mg/miynfT-t*K 5^^DL 

^Hffflffi^T ^It^o RNAgftli, ft&WKfc^ <4v7u/is~-ArX 
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frXteTEKnmm't&o RNA^t^aii^ T^n-^^m^ij^^ 

[0 0 5 2] 

2. illcDNA^i 

• 4. 9MMIV^^h-^ (MxJfF 1 u k attftc at # 8 5 5 2 9 

) 

•tt*»:SSjttt (Taka r a) UGC-TaqM 
[0 0 5 3] 

wmx.tmmm 

• R N a s e H -mm&ft Superscript I I ( I n v i t r 

o g e n) Rxrfsmmtmzj&(ommmm 

[0 0 5 4] 

-mmm-m-fy^c?- (o 1 i g0 -dT) 0 

5' — GAGAGAGAGAGGATCCTTCTGGAGAGTTTTTTT 
TTTTTTTTTVN — 3' ) 

ClttlCftx.T, X{i^tL^iainLr^>^A7 p 9^^- (dN 6 -dN 9 ) (fc 

•mRNAo 2. 5~2 5/* g^o 
•&&v*{± % ^RNA, 5-5 O^g 
[0 0 5 5] 

• [ fl -32p] dGTP 

[0 0 5 6] 
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#— ilcDNAfcPK-r&fci&K, =*Oj&fc&0. 5mlPCRfa-7 ( 
A, B, C) i:Aot</^ TIS^ * M&lT h o 
[0 0 5 7] 

A : JJHT^'bO^^iDo 
mRNA 2 ~2 . 5 ^ g 

^RNA 5~50^g 
$-§^7^-7- (2 g/^ 1) H/ig (7//1) 

mim : 22^1 

(mRNA. -794-?-) ^6 5^1 0»DiU mRNA« 

5 X|l-i«« 2 8. 6 /£ 1 

0. 1M DTT 11^1 
d AT P, dTTP, dGTP^5-^f;v-dCTP#10mM 

9.3^1 

4. 9My;ve 5 5- 4^1 
t&ftl h Wnd-^ 2 3. 2^1 

RNase H~ Superscript I (2 0 0U/^ 1) 

15. 0^1 

m&tm : 1 4 2 . 5^1 

[0 0 5 8] 

M^M*^ I 4 0lC4m 5 0t2m 561C60^ 

ffiMt Lt^RNA^fflv^f^tii, M^M^n*, 4 0t2m - 
o . i v/fr<D?$%imm-e 3 5 ■C'x 5ot2 ft-m^ 5 6 v 6 0 

&&V^ ^^^AT'^-fv- (dNg> N»iffi*10^^V'*^-K) TcDNA 
*^9^Ai"*J»^cti. 2 5"C 0 
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[0 0 5 9] 

•1-1. 5/zl<9[«-32p] dGTP 
[0 0 6 0] 

4 0/i 1 <7)A + Bf!^£^-^C&c#-f 0 

, 4 oxm^-h^t-- 1 ; y>r*fs *)o 

[0 0 6 1] 

?BJK* S 4 2 < C^att7^B#, SiS^^- j.-^A t b &<> 
4 0 // 1 ©A + BM^fa-'/CCito 

[0 0 6 2] 

yn h 3-;vb : GCI-h U^n-^- v;vt£ h -;WCJ:£4&1 
mRNA 5-25/ig 

^RNA 50/ig \>XT (^I^D h 3 ~ >KZ>*£-) 

ttSf-icDNA^^fv- (2/ig///.!) 14/tg (7/* 1) 

:22//l 

2X GC I (LA Taq) buffer (TaKaRa) 75/i 1 

d AT P, dTTP, dGTP, ;V-dCTP, #1 OmM 
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4 /* 1 

4 . 9Mv;nf 2 6 M l 

IfelDhV^n-^ (1^18 0%) 10/*1 
Superscript I I HE?H ( 2 0 0 U//* 1 ) 1 5 /z 1 

ddH 2 0 4//1 

: 12 8/il 

. a -3 2p_ dGTP i m 5/u ! 

4 2 r 3 o frm^ sotio 4 v^fcmm 

[0 0 6 3] 

1) ^jl-^A, BMCH-7;i/t^f^7-Cit 0 

2) *M *i;^£!§$&-tJS 0 

3) zftJKtf*4 2 < C&ciiL£B#, MCfa-7AtB^^t^o 

4) 4 0// 1 OA + BlM*f-a- , /Ci:fto 

5) t-TW^7-0-tff>f >^lw^oTfft>^r*o 
ftftl^ i&«U£l 0mMOEDTA»DLtE£^Mt-5o 

^ [a 32 P] dGTP<9JRl)&#.£:£M5eU c D N A OijXfi Srft#"t' & 

[0 0 6 4] 

3. i-icDNA<0CTABCJ:-5itt 
CTABilli, RNAi§§iWBv^*>0£|WfDo 
CATB^(i^»Jl tl^Do 
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3/uKDO. 5M EDTA (MtlOmM) 

0 c DNASJ£«Fftl 2 8-1 4 2 ^1 fc&TO & 0«:Tgiraf *o 
3 2^1 ©5MMt h USA (R N a se7'J-) 
3 2 0 ^1 OCTAB-IfM (RNAttffiK/Bv*£*><Z>t|S!C) 

1 5 0 0 0 r pm-ei 0*Kat*t* o 

100/^ GuCI "C^it^< i«?f 1"& 0 

2 5 0/ii o^^y-;v*^iBt, 7fc_hXf±-2 o/-8 or-e3 0-6 

15000rpm-ei O^HS^So _t?fr „ 

^V>T\ h£8 0 0 /z 1 tf>8 0 %x^y-;vc 2 HJSfc*£-t& 0 cD 

NA^V7 h-ffc$tL*ffilltR*rflioa5^8 0%^-^ 15 0 0 

0 r pmt3^Mt^)o 

cDNASrTkKS^-f £ (TEIIiTr i s JBv>fcv>) 0 4 6 

A* 1 

[0 0 6 5] 

• limmi-vw&mffim, pH4. 5 

• lM^x>if«> pH6. 0 

• Na l 0 4 ?#t > 1 0 OmMo Mfc^MMLtzk <D<Dfr%&m o 

• S D S 1 0 % 

• V*?->itmffi%L: 3 3mM^x^tf^A, pH6. 0, S.^0. 3 
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3 % S D S 

• ^ xyf/S D SHIt 1 0 mMH* > t K7 nv^r-A (1 

0 n g a r m) 

(MW=3 7 1. 51 ; 3. 71mg/ml = 10mM) 
^r^ry^Olf^^Wk: (A) mRNAOyi-;Vi©tft 
**B|s9f5 0-55^ 1 fcfc* JftTO^O^iJPt^o 

3. 3 (x 10 1M^x>StfV>A«il pH4. 5 
SH-jiJgl OmM©, fcrfcfcWJjRStifcNa 1 0 4 ^ 
7jc±tr4 ^a^- m-&„ 

tftK^OJ: ? i:UcDNA*itRH4. 

0. 5// 1©10%SDS, 11^1<^5M NaC 1 /x 1 <D<{ V7u 

-2 0/- s ovxmmttx3 ofrm&w-fZo 

1 5 0 0 0 r p m*e 1 5 frm$L>b'+2>o 

8 o%^?y-)i'5 oo// i *mnir2>o 

1 5 0 0 0 r p mt' 2-3 55f^iI'k1"&o 

ami 2) - 1 3) zmt)&-to 

c DNA& 5 0 fx 1 <DtK^P ^WMM^ ^> o 

tr*^wb: (b) mitPt- )^m<Dmm 

cDNA (50//1) ^^#^(0 1 6 0 fx 1 <Dfcf*?->t K9S>Ko> 

rT-&mm*imz.2>o rjb*2 io M i i-mmm) xn? 0 

(22~26t:) x-^ xi±3 7x:x3-A^m^>^^-y-r^>o 

7 5 l <D lM^xyft h «; ^a, P H6. 1 

ffiIE#2 003-3072722 
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5 p 1 C05M N a C 1 

7 5 0^1 Oft***.*-/-^ 
&±-? 1 iSWXti- 8 0/-2 01CT3 OfrJKLt-f &o 

1 5 0 0 0 r pm-Cl 0 frWMfct & ° 
ifcR£7 0%3Ui8 0%i^/ / -^t2[iIML, «^*o 

cDNAfc 1 7 5 ^ 1 ©TE (ImM Tris, pH7. 5, 0. ImM E 

dt a) Kmrn-tZo 

RNase ONE (P r ome g a*±S0 R^<DKJBB^Wi 
cDNAt^^W:^TO^^f|^t2 0 0 ju 1 K%2> X o \zMx-ko 
20/i lORNase lift (Promega) 

m— He DNA^lciv^fflfmRNAXI^RNA (^*7°n h a -from 
#) 1 g^/d9 l^fccDRNa s e I (Promega, 5X!ilOU/// 
1) o 

3 7 0 CC3 0 53^ M"&o 

KJ&*ffcihi"*fc** y-y*7)V*fc±KW%, &T<vi>o*mz-2>o 

4ftl<Dl 0%SDSXU r 

3/< 1010/i g/V 1 ^nf'ft-b'Ko 

4 5ttl 5#HK M"&o 

1 : 1 T r i s-¥fft7xy-;v : ^nn*;VAfl IHJftaj U tME £M i 
crocon-100 i:n- rt&o 

7k-CS£3tiaJ fcfi^ #JgM icrocon-Centricon 1 007^V 
M i c r o c o nftM* 1 Elff 9o 

8-b) ^7>*20^1O0. lxTEt'^Clit^o 
[0 0 6 6] 

^ e - x 7 n ? * > y 
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-v-:WOCPG}±$!0 

• SS^®* : 4 . 5M NaCl, 5 0 mM ED T A, pH8. 0 
1 . 5ml <D^3.-7Z$mir2>tzlt><Dm&X? > V&'&We&Zo 

t RNA (lOmg/ml) fc^flN S'**'*- h 

#7-f77 'J-Cov^t, 5 0 0// lOlftlf-X »mRNA2 S fx gMtz 
h§^§m#L&^<b7fc±-e3 0^M>f btio 

500^1 ^^ilgffiifc-c 3 mm&-tz>o 

[0 0 6 7] 

5' ifflcDNAmmtzmm 

£££cDNA**t$£-f TOi^atRNa s e I-McD 

1 ) fcf-Xfc 5 0 0// 1 OiM^I»t:Wt^o 

2) e^-^WkH— Sftc DNA*^tp^*-y^3 5 0 ^ IW^W. 

3) t?>o<HiPU^ 5ot;tiom fi-^@^^o 

4) Iftf^f-SlcDNA^S 5 0 1 OtT-X^^tff-i-rtC, 1 

5 0// 1 otr-X£#-f 0 

5) *9>o< 5 0*0^2 0#W, f*-r*i«SS*4, 
4o a^M«0. 5mim. 
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0. 3M NaCl/lmM EDTAt?llII 

0. 4%SDS/0. 5M NaOAc/20mM Tris-HCl pH8 
. 5/1 mM EDTA-C20 

0. 5M NaOAc/lOmM Tris-HCl pH8. 5/lmM E 
DT At? 2 0 

7Tvf} y ) }:J:^IcDNA»e-X^f)«li 
100^1O50mM NaOH, 5 mM EDTA£#Dx.& 0 

«tttr-XS:^«IU ^HL/ic DNA^tK±^#1- 0 

100^ 1050mM NaOH, 5 mM EDTA*C£*$&f£^M*A'£ 2® 
JSLtiftOiEU 8 0~9 0%OcDNA (^^K^VK^^-ti^Wc 

5' *a&7 p 9^v-fc**gptfcOcDNA^O#iHI 
R N a s e 

RNase ONE (P r o m e g a) 

f-j.-^C7X±-eiM Tris-HCl, pH7. 0 £jjPx.> 



1 ju 1 <7)RN a s e 1(10 U//* 1 ) SriD*., 3*t^l£"f &o 



RNaseI*R*t*^> c DNA^^nf^f-t'KT«U ^Mfcti 
3// %<D?V u-Yy*mifrtrZ>o cDNA^lt^f^DWMi cron-10 

0 (^p n p^) T»*tai-*o 
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• Ame r s h am-P h a r ma c i a S-4 0 OXfcfV^ry h (XtilRl 

*7Alfi: 10mM Tris, pH8. 0, ImM EDTA, 0. 1 
% S D S , lOOmM N a C 1 

SDS^I ^v^7Aif| : 1 OmM Tris, pH8. 0, ImM 
EDT and 10 OmM N a C 1 
S-4 0 0^li>*7A^n7f^77'f- 

4 0 OX if , /*7A©I^7'n h n-;VT^§o 

*7A^sm c 

3 0 0 0 rpmtl*IB«W (+4°C) 
c DNA (faffi <2 0 M 1 ) £i»t&o 
0 fx 1 O^C*j^Sni"*o 
3 0 0 0 r pmf2MWo 

Micron 100 G8jn° n £) t'iltSWiW V >/U^V -*T3fc|R3* 

[0 0 6 8] 

6. SSLLM 

S-3 0 0^t'y*7A^nv f^'77'f - ^7^ (Ame r s h am-P h 
armac i a) o 

*7AlfI: 10mM Tris-HCl pH8. 0, ImM EDTA, 
0. 1 %SDS, 10 OmM N a C 1 
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TaKaRa DNA'J^yM L 
difcSttt&mfcfl-efi, WRMfB g 1 II, G s u I and Mm 

e itmxzti&o 

fcZtiZ^Witte^o g 1 I I (fljra$& : a g a t c t ) It, 

~y Viz t<? XMM& 2>^> K* ^ i/7--trK3c&1~& C ii^-e^^o £ 
OJ:^&ii^t<04t!l<^£ LTfi. AscI (S»«P^G G C G C G C C) RVX 
ba I (I«:TCTAGA) . ^iJi^^o 
Gs u ItUraft^tN TOt'J^^Vtf K^t^o 
rf^ ^ KB g-G s u-GN 5 : 

5' -lfi"^>- agagagagaactaggcttaataggtgac 
tagatctggaggnnnnn-3' 
t'J^^l/tf KBg-Gsu-GN6 I 

5' agagagagaactaggcttaataggtgac 
tagatctggagnnnnnn-3' 

5' P-gaattctcaggactcttctatagtgtcaccta 
aagtctctctct c -NH 2 3 ' 

t'J^^Vtf KBg-Mme-GN5 : 

5 » -tftf '/-AGAGAGAGAACTAGGCTTAATAGGTGAC 
TAGATCTTCCRACGNNNNN— 3' ; 
t'J^^Vtf KBg-Mm'e-N6 : 

5' -^/-AGAGAGAGAACTAGGCTTAATAGGTGAC 
TAGATCTTCCRACNNNNNN— 3* ; 

V rf;* ^ V^-^ KB g -Mm e - d o w n : 
5' P — GTYGGAGATCTAGTCACCTATTAAGCCTAGTT 

CTCTCTCT — NH2 3' . 
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NH 2 iir^ **#in L-ciMNiaEttft^r tr v y 9 >f 5 > ^& Rfrik t r v> 

*ifc«r*i*UTV^ 0 ^V^^V-^^-Kti, il c DN AOft^i EJtfcfc: 
, 1 0 %*° V 7 9 V )VT \ YY)Vn>%mh-?mm.~$'Z> (Sambrook an 
d Russel, 2 0 0 1) o tU^^WKIi, H c DNAOi^ 

[0 0 6 9] 

OD^fx -y * Lfc^ Bg-Gsu-GN5. Bg-Gsu-N6W ^ 

t'j^^i/tfK^4 : 1 : s<om&i*m&tz>o z<oWi&, dnai 

< 2/i g/> 1 tcf^o NaCUlO OmMWfW 

tiw^o fry^^v^Kwu 6 5t;-e5m 4srt5m 371c 
•eiom 2 5tT-i0M7--;vn o 

[0 0 7 0] 

i-icDNAWlSjg 

V>*-t cDNAtl^n (ftif^m; 5^1) 
V yti-Wc DNA^^±i:#to 

TAKARA DNA7^-ya^-; hJ&»fc©»iRl I £ 5 ,u £o 
* y h 0» I £ 1 0 // 1 y»1"&o 

lOt-C-^V^ra^hn (4>&< t 1 0B#Wm±) 0 

7^'-ya ^RJ&Oft&fc, 1//1O0. 5M EDTA, 1^1010%S 

DS, l/<l©10mg/miyn7O-b% 1 0 ^ 1 Ojfc&JniU 45t 
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1) ^^A^iScHI^O (vb'Jy^^mWiit^Jt) , ^T^^ 
3) ^^A^M^^tB-r^o 

2 m i (Djjy&wmwiitmTL, m.tsK*L K) 2®#m-r^o 

*7A^ioo^ i (DmffiWLttimz., >k^-?A o o x g-c2^jt^^ 0 mm 

Ltztm*?-x-vftho *>U JidUI (10 0^1) 

ai^a^iraft* i: ra d k * & s -c - ox*i * o &t « 

1. 5mlOfa-^^t-y^lff> 15ml^I'tfa-yi:>b7fU 
*7A|;f > £#P;i & o 4 0 0 x g T? 2 M^Lyt &o 

^^fl^i"^ (c DNAOc pm(D8 0 %) 0 
*) c DNAfcifcjRS-fcfco 

fZlcDNA 
Ifil 6 5 "C 5 ft 
1M 2 6 8"C30^ 
JM3 7 2'C10# 
XH 4 +4V 
ir^cDN A<Dtztb<r>Wfe 

c D N A 

6W^nL(OLA-Ta q#'J^7-«$ (T a k a r a) 
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6 70nL©2. 5mM (#) d N T P ' s (Takara) 
0. 570DL^) [ a 32p] dGTP 

— ES-frfc*, S$H-ffl&tf*6 5t:oB»:, £Hg*fcA*L*o 

iT-~?)Vir4 >7 y-<nW&\', 10mM (tKiK) EDTA£JJtlx.&<^K£ 
oTRJfc4H£iLU ynfft- lfKHffiI^J:oTRie*i*LV^cU 7x7- 
)V-?uufr)U&ffimRXJ ! ^?y--frifcWL : fct : f do Sambrook and 
Russel, 2001, Molecular Cloning, CSHL p 
r e s s, NYtio 

[0 0 7 1] 
c DNAOM 
*: C DNA^^7^I I sffllRBSreiBSf-rao 

itt (10z) (MB I Fermentasttl) 10^1 
Gsul (IU/^1) (5U/^g DNAOT) Y ^ 1 

d dH2 O X ^ 1 

ii:£H£fI 10 0^1 

ClCl-e, Y^XiicDNAOf^t^o 

1 ) 3 7 0 CTimfflJ >^r*^- h-t^o 

2) 2^100. 5M EDTA^^ijni"^>o 

3) 6 5t)tl 53*ffi-f y**.'*- h LT***^Sfl:1"4o 

&fi;LT, CPG-MPG f l/^h7^>*««ttHtt^^) 

1) 2 0 0^1^GPG-^^i§n o 

2) 5yug<0tRNA (20mg/ml) *m\l1r2>o CltUC J; *)> cDNA 

3) gffl-ei 0-2 05*WXtt*±-C3 0-6 0^, t g if SKfttft^ 
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4) tf-X£?&14X* V KtC»LT3 45-Wfla*U jJcfflfc&T&o 

5) 1M NaCK lOmM E DTAt 3 Mffc&'i'&o ^ ti> If — X 

<D®Mim&-tv>ifc&fc * ffl ^ * o 

6) tf-X(7)^#^i:[Wl#«<7)lM NaCK 1 0 mM EDTA^tf- 

[0 0 7 2] 

7. jfiiUcDNAi'^ 

1 ) If — Xt G s u I > * W&t %> o 

2) i:lt'i^o< qm&L&tfbWSm 5flf(TW M"&o 

3) itt7-^i:3^rait^o 

4) _Wt£IIIW&o 

5) 1 xBSA (?~>lfiL7i7;V7^ **0 5 0 0 ^1^)1 xB&Wt4| 

6) 2 0 0/z 101x'J**-«t (NEB) 1?2mm&t2>o 

[0 0 7 3] 

1) 20^1<7)LoTE (ImM Tris, pH7. 5, RXfO. 1 mM 
EDTA) yiJ- I I £2rax.£ (0. 4/c g//£ 1) o 

2) f-^.-y%6 5tt5»iu m^-ei 5^»e-r^>o 

3) TaKaRa7^-ya^7fI I«I I 2 5 /i K fc 

50^1 Jnx.-?>o 

4) i ett-^r/^^-htJo 

5) ^^_~ >g ^ 5 0 0j«lOlxBSA^tlxB&Wift-e4i 

6) 2 0 0^1^1xB&WlfItllI, 2 0 0 J «lOlxBSA#tlx 

b g 1 1 1 rnmwix- 2 ®m&-?& 0 
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[0 0 7 4] 

L o TE Xfx 1 

1 o nBrnWi 10^1 

BgUI Y fx\ 

oo^i «t 9 
i) ^#^fi:^o< «£-l&^£>3 7x:x-\mm^ y*?.^- m-& 0 

3) LoTEt#i^2 0 0 ^1 Km*?-?* 

2 o o fx \<D^y^)v (v >*-%mimztifz5' jfcffiMfr) K&Toi> 

1) 133// 1(7)7. 5M NH 4 O A c 

2) 3 fi 1 Ol // %/ [x 1 ^l? 

3) 3 4 0^ lO'fVWy-A' 

-2 ot/-8ow*< t % 3 o^W'f ht^o «*af'M»-c 

4t:-ei 5 0 0 0 r pmt*2 0^M3t^1"*o ±Yt£l&*1"&o h£7 0 

%Xti8 0 ^"£2 EltSfeSM" &o 1 5 0 0 0 r p m"? 3 HJ^'fr U ^ 

y'-^iliiitSo Wtfk\~> 1 0// 1 <7)L oTE^CSSt^o 
[0075]

1 0. 9r<Dm&\zZ2> 9 J9ir (di tag) <D%m 
cDNA»5' *»IIStty^^Mt^ 0 SSSfrttftO^ 

1) TaKaRa7^-y3^y h I I & 1 0 // 1> « I £ 

2 0^1 Jnx^o 

2) i6t;-e-^^^ft^o 

3) 1 0/* 1 <7>d dH 2 0> 1/; 1O0. 5M EDTA, 1 fi 1 <0 1 0 % S
4) 4 5*C-ei 5 M~&o

5) 1:1 t r i s -^ffiiky ^ y - : ^on*MT* 1 UttftiHU 7jc^g 

-^no*^/Ajtllili, fiMflHWillJ (S amb r o o k and Rus 
s el, Molecular Cloning, 200 1 #Jg) XlicDNAft 

6) G-5 0^e>A7A (iMXBHfc) ■Cf/h©cDNAifri8fe*t4. 
100/i 1 

6 7/* 1 <7>7. 5M NH 4 O A c 

5 ^ 1 o^ 1 ; 

1 8 0 ft 1 £>>f V/n/NV-JV 

8) 4tt2 0^S^t^o 

9) 8 o%Xt*7 o%^^/-^t^U afrfrU ^^y-;v*l^*i-^o 

cdna(7)77^i i smumm^<o^jm 
mm® (x i o) 5^1 

Gsul (lU/ml) (5U//« gtti) y/^ 1 

dH 2 0 1 
4HS>$Ji (5 + y + x) = 50 µl

1) 3 o"c-eiH#iHH m-& 0 

2) 1^ 1O0. 5M EDTA*M4o 

3) 6 5t1?l^ LTBBIf*i5tee***o 

e-x<o*i#, ^t^f77^-t^ 

1) 2 0 0 fx 1 Otf-XK2. 5^ lOt RNA (20^ %/ n 1 ) fcillx.* 

o
2) *±t3o^±, m*m.^%j) t t>j>*^-h~f2>o

3) ^r/M:fU 3frm&Wik, 7ME£|&< 0 

4) W (1M NaCl/lOmM E D T A) 2 0 0 // 1 X~ffi&* 3 0 

5 ) mn Kmn tr-x t iwi c^no^^-e^-r ^> o 

1) tr-xn^m^ i s^ijpi^^atTt^^^^^^o 

2) Pt^^<I€^^i5m gi&TM M"&o 

3) Z?^FK±LX, 2~3frmW:%. ikffi (±M) ZtfL&o 

4) lxB&Wt-e3 0^t^o 

5) L o TEt?l@£fc?#-1-&o 

6) LoTE30j« lClit^o 

1) 3 0 fx IKmmLfz^-y^^KO. 4 p g/fx 1<DV>%-I 1*5 fi 1 

2) 5 2^>^r^-ft^o 

3) 5^$Cffi-f &o 

4) xlOOlfl4/^ li)Dx.l.o 

5) T4 DNAVtf—tr 1 1 *inx.&o 

6) 16tT2m B#^M^^^^f h-t&o 

7) lxB&W*t?3i^U ^^LTV^^#<7)M«m-e2llI^1-^ 



1) 3 7t;-Cll$ISK ^^-FtSo 

2) VK 2 ~33fTO£> Jifil&IIIJDlt&o 

3) LoTE , e«tU#ii2 0 0 ) an:no 

4) 133//1 7. 5M NH4OAC, 3. l^g/ral^J3 



x.i ottftrft 



1 0 fX 1 
X/l 1 

y// 1 

(10 + x + y) = 100 µl



L 0 TE
5 fx 1
5 fi 1 



-2 Or/- 8 0*0-^3 O^KJSLt'f hU 15000r p 

2 0 frKSI'k, ^?Hi7 0 8 0 y ->U"CftiSMk, 4^1 Lo 

TEClitl)o 

1) xl07^-y3>iil 0 

2) T4 DNAV**- 1* 0 

3) 1 6t:-e-Hjfe^ V^ra^- ft^o 

10x ExTaq buffer
DMSO
2.5 mM dNTPs
350 ng/µl primer
cDNA

ExTaq
dH2O



5ju 
1 6 ju 

X fl 

If 



Y M 

(5 + 3 + 16 + 2 + 1 + z + y) = 50 µl
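The volume sum above fixes the reaction at 50 µl, so the water volume follows from the other components. The following is a minimal sketch of that arithmetic only. The function name and the example template volume of 4 µl are hypothetical, and the assignment of the fixed volumes (5, 3, 16, 2, 1 µl) to individual reagents is not recoverable from the text, so they are passed in as a plain list.

```python
def water_volume_ul(total_ul, fixed_ul, template_ul):
    """Return the water volume (y) that brings the reaction up to total_ul.

    total_ul    -- final reaction volume (50 µl in the text)
    fixed_ul    -- volumes of the fixed components (5, 3, 16, 2, 1 µl in the text)
    template_ul -- variable template volume (z in the text)
    """
    used = sum(fixed_ul) + template_ul
    water = total_ul - used
    if water < 0:
        raise ValueError("components already exceed the total reaction volume")
    return water


# Hypothetical example: 4 µl of template leaves 19 µl of water.
if __name__ == "__main__":
    print(water_volume_ul(50, [5, 3, 16, 2, 1], 4))
```

The check for a negative result guards against a template volume that no longer fits in the reaction.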
9 5TX 2frm<D^ 9 5X3. 3 0$\ 5 5t:. 1#H^ 7 2 

1 ) MUX %tzbf$.*)<7)y-> i>7s>r-)\'T v -fLXUB £ o 
[0076]

7V*-||i:UcDNA« 

•9- y £ 5 // l^LoTE KUMffi U &T<7) & <d *&t<dm&x1mz. 



1 2 
1) 

i) LoTE 

i i ) 1 0 x E c o R 

i i i) E c oR I 



1 

5 /* 1 
1 



ffiffiE# 2003-3072722 



12 0 0 2-2 3 5 2 9 4 w ^-v : 44/ 



mm* 5 o M i tc-r^o 

2) 3 7x:-^iwm4 y^^-^t^o 

3) 1/ulOO. 5M EDTA, 1/z lOl 0%SDS 1 ju 1 GO 1 0 /* 
g/ fx 1 ?UT41—*fK*mkZ>o 

4) 4 5tt'15W/^ra^-fn o 

5) 1 : 1 Tr i s-^$Hb:7^y-;i/:>nn*;i/AT*l|n|»ffiU tME 

o 

100/zi io^-zz-fn/ 

6 7// 1<D7. 5M NH 4 O A c 

5 fx 1 O/'ja-r/ 
1 8 0// 1 0'fV7 p n;V-;i' 

8) 4*C-e2 0#W3£'fc-$-&o 

9) 8 0%X(i7 0%x^;-;K«U »'frU #H & / 

[0077]

11. ^^^HSK^a^^fv-cM 

1) 5 n i ol o TEKnmm-tZo 

2) TaKaRa7^-y3>^f'yM IOjSfftI I * 5 fx 1> I £ 1 
0 // 1 2JDx.&o 

3) 16TC-C1. 5H#F^-f y*x.s<- h-t&o 

4) 1//1<£>0. 5M EDTA. 1 fx 1 <0 1 0 % S D S. 

5) 4 5ttl5^>^^-hn o 

6) i:i T r i s -¥f^7x/-;v : ^nn^VAtl EJ&fflU tMB 






o 

7) 5 ^ gO^V n-r^^^^r 1 ; Ti: LTinx., ^y/n^V-^jt^^ 

100/tl <0-9-^-7> 
67^1(07. 5M NH 4 OA c 

5 1 <7)^>; 
1 8 0 fx 1 O'f v^n^v-^ 

8) 4"Ct2 0^^no 

9) 8 o%Xte7 o%*-? ;-)vx*m&\^, Jfrku k^y-;^^ 

10) 5,« 10ddH 2 0(:S§j!|tl.o 
[0078]

«M2 

ifB-e^^ttTt^^^ pBluescript II KS+ (Strat 
a gene) OX ? &^tf-~>^^ <b *l& 0 #ifc<D^n- 

^-lOOngt^l^t^o DNA7^-y3^-yfVer2 (T 

a k a r a) ^I^S^OnL^ * km>&1rZ> 0 

[0079]

Kmm : 

Roche) ^ 2 0mM<Dmti-hVV&RVt8 0%^*J--)VZffiMLtz^ 
DNA£?£Jg:$-£&o 150W^DL<D8 0%^*y-;V
yitlY/f^uL^lK Cell-Porator r^£fflv\ ElectroMAX(TM) DH10B(TM) tt (Invitrogen) ^^W^t* (
*3e#OV-iT^lC|B1R$*L^BRIEtft^ffe^o^:B i ome t r e r Q M 

[0080]

^Jfc0B3 : I»y-l>y^ 

K^-fv-^fflv^, BigDye Terminator Cycle Sequencing Ready Reaction Kit v2.0 (Applied Biosystems) and ABI 3700 (Applied Biosystems) *>-*-V9 &ffiv\ SBt#<a«^&IB#fct£o-Cfr 3 £

[0081]

4 : 5 ' iffitm 9 ?<d m& 

§ "!> o 

[0082]

xtfiM 5 : 5 ' ^^se^ij * rcomm^w 

5 ' 5k3§#*fl9E?9* ^fci. the Genetics Computer Group (GCG) package of Accelrys Inc. (http://www.accelrys.com/) "CMM nTtEfc. > Y<Dtz#>

^ Wi^ NCBI BLAST (http://www.ncbi.nlm.nih.gov/BLAST/), FASTA O±H«*W4y7h^x7V
5' 3fc8MS** , * r *fflt'^ GenBank fcfcltfc BLAST tfc*
TCtto 16bpW^^ (5'-ACC TCC CTC CGC GGA G) ^hTGF-b1 (JBC 264 (1989) 402-408)05'
Query= (16 letters) (ACCTCCCTCCGCGGAG)

Database: All GenBank+EMBL+DDBJ+PDB sequences (but no EST, STS, 
GSS, or phase 0, 1 or 2 HTGS sequences) 

           1,205,903 sequences; 5,297,768,116 total letters

                                                                  Score    E
Sequences producing significant alignments:                       (bits) Value

gi|10863872|ref|NM_000660.1| Homo sapiens transforming grow...      32   1.1
gi|18590091|ref|XM_085882.1| Homo sapiens similar to transf...      32   1.1
gi|11424057|ref|XM_008912.1| Homo sapiens transforming grow...      32   1.1
gi|7684381|gb|AC011462.4|AC011462 Homo sapiens chromosome 1...      32   1.1
gi|15027087|emb|AL389894.4|LMFLCHR4A Leishmania major Fried...      32   1.1
gi|1943914|gb|U70540.1|LMU70540 Leishmania mexicana amazone...      32   1.1
gi|37097|emb|X05839.1|HSTGFBG1 Human transforming growth fa...      32   1.1
gi|37092|emb|X02812.1|HSTGFB1 Human mRNA for transforming g...      32   1.1
gi|340526|gb|J04431.1|HUMTGFB1PR Homo sapiens transforming ...      32   1.1

gi|18858696|ref|NM_131739.1| Danio rerio forkhead box C1a (...      30   4.2
gi|12004937|gb|AF219949.1|AF219949 Danio rerio forkhead tra...      30   4.2
gi|193004|gb|M13366.1|MUSGPDX Mouse glycerophosphate dehydr...      30   4.2
gi|193601|gb|MgftfiflS.1|MUSGPD Mouse glycerol-3-phosphate deh...   30   4.2
gi|63465|emb|V00414.1|GGHTQ1 Gallus gallus mRNA coding for...       30   4.2
gi|63444|emb|X13894.1|GGH2AF Chicken histone H2A.F gene             30   4.2
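The one-line summaries above all follow the same layout: a `gi|number|database|accession|locus` identifier, a truncated description, a bit score and an E value. As a rough sketch of how such lines could be read back into structured records, the following parser uses a regular expression. The `Hit` type and `parse_hit` function are hypothetical names introduced here for illustration; they are not part of BLAST or of the procedure described in this document.

```python
import re
from typing import NamedTuple, Optional


class Hit(NamedTuple):
    gi: int
    database: str
    accession: str
    locus: str
    description: str
    bits: int
    evalue: float


# gi|<number>|<db>|<accession>|<locus, may be empty> <description> <bits> <E value>
_HIT = re.compile(
    r"^gi\|(\d+)\|([^|]+)\|([^|]+)\|(\S*)\s+(.*?)\s+(\d+)\s+([0-9.eE+-]+)\s*$"
)


def parse_hit(line: str) -> Optional[Hit]:
    """Parse one BLAST summary line; return None if the line does not match."""
    match = _HIT.match(line.strip())
    if match is None:
        return None
    gi, db, acc, locus, desc, bits, evalue = match.groups()
    if evalue.lower().startswith("e"):
        # BLAST abbreviates values such as 1e-100 to "e-100".
        evalue = "1" + evalue
    return Hit(int(gi), db, acc, locus, desc, int(bits), float(evalue))


if __name__ == "__main__":
    line = "gi|10863872|ref|NM_000660.1| Homo sapiens transforming grow...      32   1.1"
    print(parse_hit(line))
```

Lines that were damaged beyond this layout simply return `None`, which makes it easy to list them for manual inspection.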

Alignments 

>gi|10863872|ref|NM_000660.1| Homo sapiens transforming growth factor, beta 1
           (Camurati-Engelmann disease) (TGFB1), mRNA
          Length = 2745

 Score = 32.2 bits (16), Expect = 1.1
 Identities = 16/16 (100%)
 Strand = Plus / Plus

Query: 1  acctccctccgcggag 16
          ||||||||||||||||
Sbjct: 1  acctccctccgcggag 16






>gi|18590091|ref|XM_085882.1| Homo sapiens similar to transforming growth
           factor, beta 1 (H. sapiens) (LOC147760), mRNA
          Length = 697

 Score = 32.2 bits (16), Expect = 1.1
 Identities = 16/16 (100%)
 Strand = Plus / Plus

Query: 1  acctccctccgcggag 16
          ||||||||||||||||
Sbjct: 7  acctccctccgcggag 22

>gi|11424057|ref|XM_009912.1| Homo sapiens transforming growth factor, beta 1
           (TGFB1), mRNA
          Length = 2741

 Score = 32.2 bits (16), Expect = 1.1
 Identities = 16/16 (100%)
 Strand = Plus / Plus

Query: 1  acctccctccgcggag 16
          ||||||||||||||||
Sbjct: 1  acctccctccgcggag 16
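For a 16-base tag that matches over its full length, as in the alignments above, the BLAST result can be cross-checked with a plain exact-match search. The sketch below is not the BLAST algorithm. It is a simple substring search over both strands that reports 1-based coordinates in the same style as the `Sbjct` lines, with the start greater than the end for Minus-strand hits. The function names are hypothetical.

```python
_COMPLEMENT = str.maketrans("acgtACGT", "tgcaTGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_tag(tag: str, subject: str):
    """Return (strand, start, end) for every exact match of tag in subject.

    Coordinates are 1-based on the subject. For Minus-strand hits the start
    is larger than the end, mirroring BLAST's "Plus / Minus" output.
    A tag equal to its own reverse complement is reported on both strands.
    """
    tag = tag.lower()
    subject = subject.lower()
    hits = []
    for strand, pattern in (("Plus", tag), ("Minus", reverse_complement(tag))):
        pos = subject.find(pattern)
        while pos != -1:
            first, last = pos + 1, pos + len(pattern)
            hits.append((strand, first, last) if strand == "Plus" else (strand, last, first))
            pos = subject.find(pattern, pos + 1)
    return hits


if __name__ == "__main__":
    tag = "ACCTCCCTCCGCGGAG"
    print(find_tag(tag, "acctccctccgcggagcagccagacagcga"))
```

An exact search like this only confirms perfect matches; partial hits such as the 14- and 15-base alignments still need an alignment tool.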

Database: All GenBank+EMBL+DDBJ+PDB sequences (but no EST, STS, GSS, 
or phase 0, 1 or 2 HTGS sequences) 
Posted date: Apr 9, 2002 10:59 AM
Number of letters in database: 1,002,800,820
Number of sequences in database: 1,205,903 

Lambda K H 

1.37 0.711 1.31 

Gapped 

Lambda K H 

1.37 0.711 1.31 
Matrix: blastn matrix :1 -3 
Gap Penalties: Existence: 5, Extension: 2 
Number of Hits to DB: 6901 
Number of Sequences: 1205903 
Number of extensions: 6901 
Number of successful extensions: 1479 
Number of sequences better than 10.0: 16
length of query: 16 
length of database: 5,297,768,116
effective HSP length: 15 
effective length of query: 1 
effective length of database: 5,279,679,571 
effective search space: 5279679571 
effective search space used: 5279679571 
T: 0 
A: 30 

X1: 6 (11.9 bits)
X2: 15 (29.7 bits)
S1: 12 (24.3 bits)
S2: 15 (30.2 bits)
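The statistics above are enough to reproduce the reported Expect values. The sketch below assumes the standard Karlin-Altschul relation E = m'n' / 2^S', where S' is the bit score and m'n' is the effective search space, and uses only numbers printed in this report (database length, number of sequences, effective HSP length 15, query length 16). The function names are hypothetical. Because the bit scores are printed to one decimal, the results agree with the report only approximately.

```python
def effective_db_length(db_letters: int, db_sequences: int, hsp_length: int) -> int:
    """Database length after subtracting the effective HSP length per sequence."""
    return db_letters - db_sequences * hsp_length


def expect_value(bit_score: float, query_length: int, db_letters: int,
                 db_sequences: int, hsp_length: int) -> float:
    """Approximate E value from a bit score: E = m' * n' / 2**S'."""
    eff_query = max(query_length - hsp_length, 1)
    eff_db = effective_db_length(db_letters, db_sequences, hsp_length)
    return eff_query * eff_db / 2 ** bit_score


if __name__ == "__main__":
    # Values taken from the report above (16-letter query, nr-type database).
    print(effective_db_length(5_297_768_116, 1_205_903, 15))
    print(round(expect_value(32.2, 16, 5_297_768_116, 1_205_903, 15), 2))
```

With these inputs the effective database length comes out at 5,279,679,571, the figure printed in the report, and a bit score of 32.2 gives an E value close to the reported 1.1.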

1: NM_000660. Homo sapiens tran... [gi:10863872]

LOCUS       NM_000660    2745 bp    mRNA    linear   PRI 13-FEB-2002
DEFINITION  Homo sapiens transforming growth factor, beta 1 (Camurati-Engelmann
            disease) (TGFB1), mRNA.
ACCESSION   NM_000660
VERSION     NM_000660.1  GI:10863872
KEYWORDS    .
SOURCE      human.
  ORGANISM  Homo sapiens
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.
REFERENCE   1  (bases 1 to 2745)
  AUTHORS   Derynck,R., Jarrett,J.A., Chen,E.Y., Eaton,D.H., Bell,J.R.,
            Assoian,R.K., Roberts,A.B., Sporn,M.B. and Goeddel,D.V.
  TITLE     Human transforming growth factor-beta complementary DNA sequence
            and expression in normal and transformed cells
  JOURNAL   Nature 316 (6030), 701-705 (1985)
  MEDLINE   85296301
REFERENCE   2  (bases 1 to 2745)
  AUTHORS   Sporn,M.B., Roberts,A.B., Wakefield,L.M. and Assoian,R.K.
  TITLE     Transforming growth factor-beta: biological function and chemical
            structure
  JOURNAL   Science 233 (4763), 532-534 (1986)
  MEDLINE   86261803
   PUBMED   3487831
REFERENCE   3  (bases 1 to 2745)
  AUTHORS   Chang,N.S., Mattison,J., Cao,H., Pratt,N., Zhao,Y. and Lee,C.
  TITLE     Cloning and characterization of a novel transforming growth
            factor-beta1-induced TIAF1 protein that inhibits tumor necrosis
            factor cytotoxicity
  JOURNAL   Biochem. Biophys. Res. Commun. 253 (3), 743-749 (1998)
  MEDLINE   99119079
   PUBMED   9918798
REFERENCE   4  (bases 1 to 2745)
  AUTHORS   Ghadami,M., Makita,Y., Yoshida,K., Nishimura,G., Fukushima,Y.,
            Wakui,K., Ikegawa,S., Yamada,K., Kondo,S., Niikawa,N. and
            Tomita,H.
  TITLE     Genetic mapping of the Camurati-Engelmann disease locus to
            chromosome 19q13.1-q13.3
  JOURNAL   Am. J. Hum. Genet. 66 (1), 143-147 (2000)
  MEDLINE   2ni00fil7
   PUBMED   10631140
REFERENCE   5  (bases 1 to 2745)
  AUTHORS   Vaughn,S.P., Broussard,S., Hall,C.R., Scott,A., Blanton,S.H.,
            Milunsky,J.M. and Hecht,J.T.
  TITLE     Confirmation of the mapping of the Camurati-Englemann locus to
            19q13.2 and refinement to a 3.2-cM region
  JOURNAL   Genomics 66 (1), 119-121 (2000)
  MEDLINE   20304762
   PUBMED   1Q843S14
REFERENCE   6  (bases 1 to 2745)
  AUTHORS   Lim,J.M., Kim,J.A., Lee,J.H. and Joo,C.K.
  TITLE     Downregulated expression of integrin alpha6 by transforming growth
            factor-beta(1) on lens epithelial cells in vitro
  JOURNAL   Biochem. Biophys. Res. Commun. 284 (1), 33-41 (2001)
  MEDLINE   21268057
   PUBMED   11374867
COMMENT     PROVISIONAL REFSEQ: This record has not yet been subject to final
            NCBI review. The reference sequence was derived from X02812.1.
FEATURES             Location/Qualifiers
     source          1..2745
                     /organism="Homo sapiens"
                     /db_xref="taxon:9606"
                     /chromosome="19"
                     /map="19q13.1"
     gene            1..2745
                     /gene="TGFB1"
                     /note="TGFB; DPD1; CED"
                     /db_xref="LocusID:7040"
                     /db_xref="MIM:190180"
     misc_feature    37..113
                     /note="pot. hairpin loops-forming region"
     variation       72
                     /allele="-"
                     /allele="C"
                     /db_xref="dbSNP:1BQDBB3"
     variation       79
                     /allele="-"
                     /allele="C"
                     /db_xref="dbSNP:1799753"
     CDS             842..2017
                     /gene="TGFB1"
                     /note="transforming growth factor, beta 1; diaphyseal
                     dysplasia 1, progressive (Camurati-Engelmann disease)"
                     /codon_start=1
                     /db_xref="LocusID:7040"
                     /db_xref="MIM:190180"
                     /product="transforming growth factor, beta 1
                     (Camurati-Engelmann disease)"
                     /protein_id="NP QQQG514"
                     /db_xref="GI:10863873"
                     /translation="MPPSGLRLLPLLLPLLWLLVLTPGPPAAGLSTCKTIDMELVKRK
                     RIEAIRGQILSKLRLASPPSQGEVPPGPLPEAVLALYNSTRDRVAGESAEPEPEPEAD
                     YYAKEVTRVLMVETHNEIYDKFKQSTHSIYMFFNTSELREAVPEPVLLSRAELRLLRR
                     LKLKVEQHVELYQKYSNNSWRYLSNRLLAPSDSPEWLSFDVTGVVRQWLSRGGEIEGF
                     RLSAHCSCDSRDNTLQVDINGFTTGRRGDLATIHGMNRPFLLLMATPLERAQHLQSSR
                     HRRALDTNYCFSSTEKNCCVRQLYIDFRKDLGWKWIHEPKGYHANFCLGPCPYIWSLD
                     TQYSKVLALYNQHNPGASAAPCCVPQALEPLPIVYYVGRKPKVEQLSNMIVRSCKCS"
     misc_feature    863..910
                     /note="pot. core sequence of signal peptide (aa -272 to
                     -257)"
     variation       870
                     /allele="C"
                     /allele="T"
                     /db_xref="dbSNP:1982073"
     variation       915
                     /allele="C"
                     /allele="G"
                     /db_xref="dbSNP:1800471"
     misc_feature    938..1600
                     /note="TGFb_propeptide; Region: TGF-beta propeptide"
     misc_feature    953
                     /note="pot. altern. translation start site"
     misc_feature    1035..1043
                     /note="put. glycosylation site"
     misc_feature    1247..1255
                     /note="put. glycosylation site"
     misc_feature    1370..1378
                     /note="put. glycosylation site"
     variation       1632
                     /allele="C"
                     /allele="T"
                     /db_xref="dbSNP:1800472"
     mat_peptide     1679..2014
                     /product="mature TGF-beta (aa 1-112)"
     misc_feature    1715..2014
                     /note="TGF-beta; Region: Transforming growth factor beta
                     like domain"
     misc_feature    1721..2014
                     /note="TGFB; Region: Transforming growth factor-beta
                     (TGF-beta) family"
     misc_feature    2018..2096
                     /note="GC-rich region"
     promoter        2097..2103
                     /note="TATA-box-like region"
     misc_feature    2517..2522
                     /note="put. polyadenylation signal"
     polyA_site      2539
                     /note="polyadenylation site"
BASE COUNT      527 a    938 c    801 g    479 t
ORIGIN
        1 acctccctcc gcggagcagc cagacagcga gggccccggc cgggggcagg ggggacgccc
       61 cgtccggggc accccccccg gctctgagcc gcccgcgggg ccggcctcgg cccggagcgg
      121 aggaaggagt cgccgaggag cagcctgagg ccccagagtc tgagacgagc cgccgccgcc
      181 cccgccactg cggggaggag ggggaggagg agcgggagga gggacgagct ggtcgggaga
      241 agaggaaaaa aacttttgag acttttccgt tgccgctggg agccggaggc gcggggacct

      301 cttggcgcga cgctgccccg cgaggaggca ggacttgggg accccagacc gcctcccttt
      361 gccgccgggg acgcttgctc cctccctgcc ccctacacgg cgtccctcag gcgcccccat
      421 tccggaccag ccctcgggag tcgccgaccc ggcctcccgc aaagactttt ccccagacct
      481 cgggcgcacc cccxgcacgc cgccttcatc cccggcctgt ctcctgagcc cccgcgcatc
      541 rvtagacnett tctcctccag gagacggatc tctctccgac ctgccacaga tcccctattc
      601 aagaccaccc accttctggt accagatcgc gcccatctag gttatttccg tgggatactg
      661 agacaccccc ggtccaagcc tcccctccac cactgcgccc ttctccctga ggagcctcag
      721 ctttccctcg aggccctcct accttttgcc gggagacccc cagcccctgq aggggcgggg
      781 cctccccacc acaccagccc tgttcgcgct ctcggcagtg ccggggggcg ccgcctcccc
      841 catgccgccc tccgggctgc ggctgctgcc gctgctgcta ccgctgctgt ggctactggt
      901 gctgacgcct ggcccgccgg ccgcgggact atccacctgc aagactatcg acatggagct
      961 ggtgaagcgg aagcgcatcg aggccatccg cggccagatc ctgtccaagc tgcggctcgc
     1021 cagccccccg agccaggggg aggtgccgcc cggcccgctg cccgaggccg tgctcgccct
     1081 gtacaacagc acccgcgacc gggtggccgg ggagagtgca gaaccggagc ccgagcctga
     1141 ggccgactac tacgccaagg aggtcacccg cgtgctaatg gtggaaaccc acaacgaaat
     1201 ctatgacaag ttcaagcaga gtacacacag catatatatg ttcttcaaca catcagagct
     1261 ccgagaagcg gtacctgaac ccgtgttgct ctcccgggca gagctgcgtc tgctgaggag
     1321 gctcaagtta aaagtggagc agcacgtgga gctgtaccag aaatacagca acaattcctg
     1381 gcgatacctc agcaaccggc tgctggcacc cagcgactcg ccagagtggt tatcttttga
     1441 tgtcaccgga gttgtgcggc agtggttgag ccgtggaggg gaaattgagg gctttcgcct
     1501 tagcgcccac tgctcctgtg acagcaggga taacacactg caagtggaca tcaacgggtt
     1561 cactaccggc cgccgaggtg acctggccac cattcatggc atgaaccggc ctttcctgct
     1621 tctcatggcc accccgctgg agagggccca gcatctgcaa agctcccggc accgccgagc
     1681 cctggacacc aactattgct tcagctccac ggagaagaac tgctgcgtgc ggcagctgta
     1741 cattgacttc cgcaaggacc tcggctggaa gtggatccac gagcccaagg gctaccatgc
     1801 caacttctgc ctcgggccct gcccctacat ttggagcctg gacacgcagt acagcaaggt
     1861 cctggccctg tacaaccagc ataacccggg cgcctcggcg gcgccgtgct gcgtgccgca
     1921 ggcgctggag ccgctgccca tcgtgtacta cgtgggccgc aagcccaagg tggagcagct
     1981 gtccaacatg atcgtgcgct cctgcaagtg cagctgaggt cccgccccgc cccgccccgc
     2041 cccggcaggc ccggccccac cccgccccgc ccccgctgcc ttgcccatgg gggctgtatt

     2101 taaggacacc gtgccccaag qccacctggg gccccattaa agatggagag aggactgcgg
     2161 atctctgtgt cattgggcgc ctgcctgggg tctccatccc tgacgttccc ccactcccac
     2221 tccctctctc tccctctctg cctcctcctg cctgtctgca ctattccttt gcccggcatc
     2281 aaggcacagg ggaccagtgg ggaacactac tgtagttaga tctatttatt gagcaccttg
     2341 ggcactgttg aagtgcctta cattaatgaa ctcattcagt caccatagca acactctgag
     2401 atggcaggga ctctgataac acccatttta aaggttgagg aaacaagccc agagaggtta
     2461 agggaggagt tGCtgcccac caggaacctg ctttagtggg ggatagtgsia gaagacaata
     2521 aaagatagta gttcaggcca ggcggggtgc tcacgcctgt aatcctagca cttttgggag
     2581 gcagagatgg gaggatactt gaatccaggc atttgagacc agcctgggta acatagtgag
     2641 accctatctc tacaaaacac ttttaaaaaa tgtacacctg tggtcccagc tactctggag
     2701 gctaaggtgg gaggatcact tgatcctggg aggtcaaggc tgcag
//

Revised: October 24, 2001. 
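The ORIGIN block of a GenBank flat file stores the sequence as numbered lines of ten-base groups, terminated by `//`. As a minimal sketch of how the 16-base 5' end tag can be checked against such a record, the following reader strips the coordinates and spaces and returns the bare sequence. The function names are hypothetical, and the sample passes only the first two sequence lines of the record.

```python
from collections import Counter


def parse_origin(lines):
    """Join the sequence lines of a GenBank ORIGIN block into one string."""
    chunks = []
    for raw in lines:
        line = raw.strip()
        if line == "//":
            break                      # end of record
        if not line or line.startswith("ORIGIN"):
            continue
        parts = line.split()
        if parts[0].isdigit():
            parts = parts[1:]          # drop the leading coordinate
        chunks.append("".join(parts).lower())
    return "".join(chunks)


def base_count(sequence):
    """Count a, c, g and t, for comparison with the BASE COUNT line."""
    counts = Counter(sequence)
    return {base: counts.get(base, 0) for base in "acgt"}


if __name__ == "__main__":
    sample = [
        "ORIGIN",
        "        1 acctccctcc gcggagcagc cagacagcga gggccccggc cgggggcagg ggggacgccc",
        "       61 cgtccggggc accccccccg gctctgagcc gcccgcgggg ccggcctcgg cccggagcgg",
        "//",
    ]
    seq = parse_origin(sample)
    print(seq[:16], len(seq), base_count(seq))
```

Comparing `base_count` over the full record with the BASE COUNT line is a quick way to spot characters that were damaged in transcription.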

Query= (16 letters) 

Database: GenBank Human EST entries
           4,280,058 sequences; 2,114,234,064 total letters

                                                                  Score    E
Sequences producing significant alignments:                       (bits) Value

gi|19365764|gb|BM915385.1|BM915385 AGENCOURT_6701642 NIH_MG...      32  0.41
gi|19353768|gb|BM903897.1|BM903897 AGENCOURT_6696012 NIH_MG...      32  0.41
gi|18807810|gb|BM562052.1|BM562052 AGENCOURT_6562015 NIH_MG...      32  0.41
gi|18791603|gb|BM553137.1|BM553137 AGENCOURT_6572574 NIH_MG...      32  0.41
gi|16171065|gb|BI908151.1|BI908151 603067456F1 NIH_MGC_118 ...      32  0.41
gi|15759271|gb|BI767693.1|BI767693 603060648F1 NIH_MGC_122 ...      32  0.41
gi|15343643|gb|BI518851.1|BI518851 603061760F1 NIH_MGC_118 ...      32  0.41
gi|14309343|gb|BG899094.1|BG899094 HOA21-1-G9 HOA (Human Os...      32  0.41
gi|13662542|gb|BG611171.1|BG611171 602612144F1 NIH_MGC_60 H...      32  0.41
gi|12609210|gb|BG115704.1|BG115704 602317174F1 NIH_MGC_88 H...      32  0.41
gi|12101282|gb|BF796228.1|BF796228 602258513F1 NIH_MGC_85 H...      32  0.41
gi|11152079|gb|BF238160.1|BF238160 601811886F1 NIH_MGC_48 H...      32  0.41
gi|11100313|gb|BF206727.1|BF206727 601871105F1 NIH_MGC_19 H...      32  0.41
gi|11100272|gb|BF206686.1|BF206686 601871051F1 NIH_MGC_19 H...      32  0.41
gi|16775383|gb|BM046103.1|BM046103 603625849F1 NIH_MGC_40 H...      30  1.6

gi|19739174|gb|BQ014273.1|BQ014273 UI-H-ED1-axs-h-21-0-UI.s...      28  6.4
gi|19378603|gb|BM928224.1|BM928224 AGENCOURT_6699855 NIH_MG...      28  6.4
gi|19367808|gb|BM917429.1|BM917429 AGENCOURT_6606724 NIH_MG...      28  6.4
gi|19364214|gb|BM913835.1|BM913835 AGENCOURT_6612786 NIH_MG...      28  6.4
gi|19361343|gb|BM910964.1|BM910964 AGENCOURT_6615957 NIH_MG...      28  6.4
gi|18505954|gb|BM456914.1|BM456914 AGENCOURT_6404253 NIH_MG...      28  6.4
gi|18499709|gb|BM450669.1|BM450669 AGENCOURT_6394717 NIH_MG...      28  6.4
gi|16000196|gb|BI859449.1|BI859449 603388188F1 NIH_MGC_87 H...      28  6.4
gi|15928460|gb|BI818193.1|BI818193 603032663F1 NIH_MGC_115 ...      28  6.4
gi|15431547|gb|BI544235.1|BI544235 603241605F1 NIH_MGC_95 H...      28  6.4
gi|15345229|gb|BI520437.1|BI520437 603071622F1 NIH_MGC_119 ...      28  6.4
gi|14440373|gb|BI033747.1|BI033747 PM3-NN0223-220201-014-h0...      28  6.4
gi|14426676|gb|BI020046.1|BI020046 CM3-MT0291-110101-622-f0...      28  6.4
gi|14081825|gb|BG770672.1|BG770672 602734012F1 NIH_MGC_49 H...      28  6.4
gi|13546630|gb|BG547965.1|BG547965 602576071F1 NIH_MGC_77 H...      28  6.4
gi|13030375|gb|BG281450.1|BG281450 602401966F1 NIH_MGC_20 H...      28  6.4
gi|12951460|emb|AL582959.1|AL582959 AL582959 LTI_NFL010_BC2...      28  6.4
gi|12764352|gb|BG254536.1|BG254536 602368464F1 NIH_MGC_91 H...      28  6.4
gi|12378592|gb|BF961317.1|BF961317 PM3-NN0223-111200-004-d0...      28  6.4
gi|12374538|gb|BF957263.1|BF957263 PM3-NN0223-241100-002-b0...      28  6.4
gi|13323H4|gb|BF926150.1|BF926150 CM2-NT0193-301100-562-a1...       28  6.4
gi|12259862|gb|BF869732.1|BF869732 IL3-ET0114-251000-316-A1...      28  6.4
gi|12129894|gb|BF800905.1|BF800905 PM1-CI0110-201000-003-f0...      28  6.4
gi|12071436|gb|BF744760.1|BF744760 QV2-BT0635-311000-440-c1...      28  6.4
gi|11770407|gb|BE965733.2|BE965733 601659792R1 NIH_MGC_70 H...      28  6.4
gi|117fifiS39|gb|BE963121.2|BE963121 601656923R1 NIH_MGC_67 H...    28  6.4
gi|10348538|gb|BE890328.1|BE890328 601431783F1 NIH_MGC_72 H...      28  6.4
gi|10142985|gb|BE728993.1|BE728993 601562251F1 NIH_MGC_20 H...      28  6.4
gi|10095527|gb|BE707262.1|BE707262 PM1-HT0452-060700-008-e0...      28  6.4
gi|9772196|gb|BE543551.1|BE543551 601070523F1 NIH_MGC_12 Ho...      28  6.4
gi|9768571|gb|BE539926.1|BE539926 601060667F2 NIH_MGC_10 Ho...      28  6.4
gi|9342607|gb|BE397942.1|BE397942 601290754F1 NIH_MGC_8 Hom...      28  6.4
gi|9332870|gb|BE387505.1|BE387505 601274247F1 NIH_MGC_20 Ho...      28  6.4
gi|8140649|gb|AW950985.1|AW950985 EST363055 MAGE resequence...      28  6.4
gi|8139665|gb|AW950T29.1|AW950T29 EST362094 MAGE resequence...      28  6.4
gi|6879658|gb|AW375004.1|AW375004 MR0-CT0068-280999-002-f07...      28  6.4
gi|5435227|emb|AL079651.1|AL079651 DKFZp434N0629_r1 434 (sy...      28  6.4
gi|5406349|emb|AL036861.1|AL036861 DKFZp564O1963_r1 564 (sy...      28  6.4
gi|2566893|gb|AA641675.1|AA641675 nr62g01.s1 NCI_CGAP_Lym3 ...      28  6.4
gi|2080087|gb|AA418268.1|AA418268 zv96d09.s1 Soares_NhHMPu_...      28  6.4
gi|2056455|gb|AA402650.1|AA402650 zu49g06.r1 Soares ovary t...      28  6.4
gi|1516398|gb|AA040102.1|AA040102 zk46e02.r1 Soares_pregnan...      28  6.4
Alignments 

>gi|19365764|gb|BM915385.1|BM915385 AGENCOURT_6701642 NIH_MGC_41
           Homo sapiens cDNA clone IMAGE:5481560 5'.
          Length = 1086

 Score = 32.2 bits (16), Expect = 0.41
 Identities = 16/16 (100%)
 Strand = Plus / Plus

Query: 1  acctccctccgcggag 16
          ||||||||||||||||
Sbjct: 23 acctccctccgcggag 38

>gi|19353768|gb|BM903897.1|BM903897 AGENCOURT_6696012 NIH_MGC_67
           Homo sapiens cDNA clone IMAGE:5492392 5'.
          Length = 1497

 Score = 32.2 bits (16), Expect = 0.41
 Identities = 16/16 (100%)
 Strand = Plus / Minus

Query: 1   acctccctccgcggag 16
           ||||||||||||||||
Sbjct: 445 acctccctccgcggag 430



>gi|18807810|gb|BM562052.1|BM562052 AGENCOURT_6562015 NIH_MGC_118
           Homo sapiens cDNA clone IMAGE:5745414 5'.
          Length = 1175

 Score = 32.2 bits (16), Expect = 0.41
 Identities = 16/16 (100%)
 Strand = Plus / Plus

Query: 1  acctccctccgcggag 16
          ||||||||||||||||
Sbjct: 20 acctccctccgcggag 35

>gi|18791603|gb|BM553137.1|BM553137 AGENCOURT_6572574 NIH_MGC_41
           Homo sapiens cDNA clone IMAGE:5467063 5'.
          Length = 1100

 Score = 32.2 bits (16), Expect = 0.41
 Identities = 16/16 (100%)
 Strand = Plus / Plus

Query: 1  acctccctccgcggag 16
          ||||||||||||||||
Sbjct: 26 acctccctccgcggag 41

>gi|16171065|gb|BI908151.1|BI908151 603067456F1 NIH_MGC_118 Homo sapiens
           cDNA clone IMAGE:5216508 5'.
          Length = 706

 Score = 32.2 bits (16), Expect = 0.41
 Identities = 16/16 (100%)
 Strand = Plus / Plus

Query: 1  acctccctccgcggag 16
          ||||||||||||||||
Sbjct: 25 acctccctccgcggag 40

>gi|15759271|gb|BI767693.1|BI767693 603060648F1 NIH_MGC_122 Homo sapiens
           cDNA clone IMAGE:5209978 5'.
          Length = 862

 Score = 32.2 bits (16), Expect = 0.41
 Identities = 16/16 (100%)
 Strand = Plus / Plus

Query: 1   acctccctccgcggag 16
           ||||||||||||||||
Sbjct: 705 acctccctccgcggag 720

>gi|15343643|gb|BI518851.1|BI518851 603061760F1 NIH_MGC_118 Homo sapiens
           cDNA clone IMAGE:5210943 5'.
          Length = 943

 Score = 32.2 bits (16), Expect = 0.41
 Identities = 16/16 (100%)
 Strand = Plus / Plus

Query: 1  acctccctccgcggag 16
          ||||||||||||||||
Sbjct: 25 acctccctccgcggag 40

>gi|14309343|gb|BG899094.1|BG899094 HOA21-1-G9 HOA (Human Osteoarthritic
           Cartilage) Homo sapiens cDNA.
          Length = 364

 Score = 32.2 bits (16), Expect = 0.41
 Identities = 16/16 (100%)
 Strand = Plus / Plus

Query: 1  acctccctccgcggag 16
          ||||||||||||||||
Sbjct: 83 acctccctccgcggag 98

>gi|13662542|gb|BG611171.1|BG611171 602612144F1 NIH_MGC_60 Homo sapiens
           cDNA clone IMAGE:4737466 5'.
          Length = 897

 Score = 32.2 bits (16), Expect = 0.41
 Identities = 16/16 (100%)
 Strand = Plus / Minus

Query: 1   acctccctccgcggag 16
           ||||||||||||||||
Sbjct: 809 acctccctccgcggag 794

>gi|12609210|gb|BG115704.1|BG115704 602317174F1 NIH_MGC_88 Homo sapiens
           cDNA clone IMAGE:4417482 5'.
          Length = 838

 Score = 32.2 bits (16), Expect = 0.41
 Identities = 16/16 (100%)
 Strand = Plus / Plus

Query: 1  acctccctccgcggag 16
          ||||||||||||||||
Sbjct: 51 acctccctccgcggag 66

>gi|12101282|gb|BF796228.1|BF796228 602258513F1 NIH_MGC_85 Homo sapiens
           cDNA clone IMAGE:4341962 5'.
          Length = 1081

 Score = 32.2 bits (16), Expect = 0.41
 Identities = 16/16 (100%)
 Strand = Plus / Plus

Query: 1  acctccctccgcggag 16
          ||||||||||||||||
Sbjct: 7  acctccctccgcggag 22

>gi|11152079|gb|BF238160.1|BF238160 601811886F1 NIH_MGC_48 Homo sapiens
           cDNA clone IMAGE:4054821 5'.
          Length = 811

 Score = 32.2 bits (16), Expect = 0.41
 Identities = 16/16 (100%)
 Strand = Plus / Plus

Query: 1  acctccctccgcggag 16
          ||||||||||||||||
Sbjct: 11 acctccctccgcggag 26

>gi|11100313|gb|BF206727.1|BF206727 601871105F1 NIH_MGC_19 Homo sapiens
           cDNA clone IMAGE:4101600 5'.
          Length = 888

 Score = 32.2 bits (16), Expect = 0.41
 Identities = 16/16 (100%)
 Strand = Plus / Plus

Query: 1  acctccctccgcggag 16
          ||||||||||||||||
Sbjct: 32 acctccctccgcggag 47

>gi|11100272|gb|BF206686.1|BF206686 601871051F1 NIH_MGC_19 Homo sapiens
           cDNA clone IMAGE:4101517 5'.
          Length = 917

 Score = 32.2 bits (16), Expect = 0.41
 Identities = 16/16 (100%)
 Strand = Plus / Plus

Query: 1  acctccctccgcggag 16
          ||||||||||||||||
Sbjct: 33 acctccctccgcggag 48

>gi|16775383|gb|BM046103.1|BM046103 603625849F1 NIH_MGC_40 Homo sapiens
           cDNA clone IMAGE:5452309 5'.
          Length = 869

 Score = 30.2 bits (15), Expect = 1.6
 Identities = 15/15 (100%)
 Strand = Plus / Plus

Query: 2   cctccctccgcggag 16
           |||||||||||||||
Sbjct: 692 cctccctccgcggag 706

>gi|19739174|gb|BQ014273.1|BQ014273 UI-H-ED1-axs-h-21-0-UI.s1 NCI_CGAP_ED1
           Homo sapiens cDNA clone IMAGE:5833028 3'.
          Length = 772

 Score = 28.2 bits (14), Expect = 6.4
 Identities = 14/14 (100%)
 Strand = Plus / Minus

Query: 2   cctccctccgcgga 15
           ||||||||||||||
Sbjct: 495 cctccctccgcgga 482

>gi|19378603|gb|BM928224.1|BM928224 AGENCOURT_6699855 NIH_MGC_121
           Homo sapiens cDNA clone IMAGE:5770072 5'.
          Length = 1140

 Score = 28.2 bits (14), Expect = 6.4
 Identities = 14/14 (100%)
 Strand = Plus / Plus

Query: 2    cctccctccgcgga 15
            ||||||||||||||
Sbjct: 1009 cctccctccgcgga 1022

>gi|19367808|gb|BM917429.1|BM917429 AGENCOURT_6606724 NIH_MGC_106
           Homo sapiens cDNA clone IMAGE:5483947 5'.
          Length = 1073

 Score = 28.2 bits (14), Expect = 6.4
 Identities = 14/14 (100%)
 Strand = Plus / Plus

Query: 1   acctccctccgcgg 14
           ||||||||||||||
Sbjct: 916 acctccctccgcgg 929

>gi|19364214|gb|BM913835.1|BM913835 AGENCOURT_6612786 NIH_MGC_98
           Homo sapiens cDNA clone IMAGE:5477539 5'.
          Length = 1104

 Score = 28.2 bits (14), Expect = 6.4
 Identities = 14/14 (100%)
 Strand = Plus / Minus

Query: 2   cctccctccgcgga 15
           ||||||||||||||
Sbjct: 842 cctccctccgcgga 829

>gi|19361343|gb|BM910964.1|BM910964 AGENCOURT_6615957 NIH_MGC_98
           Homo sapiens cDNA clone IMAGE:5454547 5'.
          Length = 1128

 Score = 28.2 bits (14), Expect = 6.4
 Identities = 14/14 (100%)
 Strand = Plus / Minus

Query: 3   ctccctccgcggag 16
           ||||||||||||||
Sbjct: 883 ctccctccgcggag 870

>gi|18505954|gb|BM456914.1|BM456914 AGENCOURT_6404253 NIH_MGC_92
           Homo sapiens cDNA clone IMAGE:5583862 5'.
          Length = 1813

 Score = 28.2 bits (14), Expect = 6.4
 Identities = 14/14 (100%)
 Strand = Plus / Plus

Query: 2  cctccctccgcgga 15
          ||||||||||||||
Sbjct: 29 cctccctccgcgga 42

>gi|18499709|gb|BM450669.1|BM450669 AGENCOURT_6394717 NIH_MGC_67
           Homo sapiens cDNA clone IMAGE:5494366 5'.
          Length = 1430

 Score = 28.2 bits (14), Expect = 6.4
 Identities = 14/14 (100%)
 Strand = Plus / Plus

Query: 1    acctccctccgcgg 14
            ||||||||||||||
Sbjct: 1150 acctccctccgcgg 1163

>gi|16000196|gb|BI859449.1|BI859449 603388188F1 NIH_MGC_87 Homo sapiens
           cDNA clone IMAGE:5396997 5'.
          Length = 852

 Score = 28.2 bits (14), Expect = 6.4
 Identities = 14/14 (100%)
 Strand = Plus / Plus

Query: 1   acctccctccgcgg 14
           ||||||||||||||
Sbjct: 100 acctccctccgcgg 113

>gi|15928460|gb|BI818193.1|BI818193 603032663F1 NIH_MGC_115 Homo sapiens
           cDNA clone IMAGE:5173838 5'.
          Length = 683

 Score = 28.2 bits (14), Expect = 6.4
 Identities = 14/14 (100%)
 Strand = Plus / Plus

Query: 2  cctccctccgcgga 15
          ||||||||||||||
Sbjct: 96 cctccctccgcgga 109

>gi|15431547|gb|BI544235.1|BI544235 603241605F1 NIH_MGC_95 Homo sapiens
           cDNA clone IMAGE:5284296 5'.
          Length = 676

 Score = 28.2 bits (14), Expect = 6.4
 Identities = 14/14 (100%)
 Strand = Plus / Minus

Query: 3  ctccctccgcggag 16
          ||||||||||||||
Sbjct: 39 ctccctccgcggag 26

>gi|15345229|gb|BI520437.1|BI520437 603071622F1 NIH_MGC_119 Homo sapiens
           cDNA clone IMAGE:5163773 5'.
          Length = 727

 Score = 28.2 bits (14), Expect = 6.4
 Identities = 14/14 (100%)
 Strand = Plus / Minus

Query: 1   acctccctccgcgg 14
           ||||||||||||||
Sbjct: 505 acctccctccgcgg 492

>gi|14440373|gb|BI033747.1|BI033747 PM3-NN0223-220201-014-h04 NN0223
           Homo sapiens cDNA.
          Length = 284

 Score = 28.2 bits (14), Expect = 6.4
 Identities = 14/14 (100%)
 Strand = Plus / Minus

Query: 1  acctccctccgcgg 14
          ||||||||||||||
Sbjct: 97 acctccctccgcgg 84

> gi I 14426676 1 gb I BI0200461 1 Br020fUfi CM3-MT0291-110101-622-f04 MT029 
1 Homo sapiens cDNA. 

Length = 436 

Score = 28.2 bits (14), Expect = 6.4 
Identities = 14/14 (100%) 
Strand = Plus / Minus 

Query: 1 acctccctccgcgg 14 

         ||||||||||||||
Sbjct: 365 acctccctccgcgg 352

>gi|14081325|gb|BG770672.1|BG770672 602734012F1 NIH_MGC_49 Homo sapiens
cDNA clone IMAGE:4859546 5'.
Length = 949

Score = 28.2 bits (14), Expect = 6.4 
Identities = 14/14 (100%) 
Strand = Plus / Plus 

Query: 1 acctccctccgcgg 14 

         ||||||||||||||
Sbjct: 63 acctccctccgcgg 76

>gi|13546630|gb|BG547965.1|BG547965 602576071F1 NIH_MGC_77 Homo sapiens
cDNA clone IMAGE:4704209 5'.
Length = 918

Score = 28.2 bits (14), Expect = 6.4 
Identities = 14/14 (100%) 
Strand = Plus / Plus 

Query: 1 acctccctccgcgg 14 



Sbjct: 248 acctccctccgcgg 261

>gi|13030375|gb|BG281450.1|BG281450 602401966F1 NIH_MGC_20 Homo sapiens
cDNA clone IMAGE:4544201 5'.
Length = 782 

Score = 28.2 bits (14), Expect = 6.4 
Identities = 14/14 (100%) 
Strand = Plus / Plus 



Query: 1 acctccctccgcgg 14
         ||||||||||||||
Sbjct: 417 acctccctccgcgg 430

>gi|12951460|emb|AL582959.1|AL582959 AL582959 LTI_NFL010_BC2 Homo
sapiens cDNA clone CSODL008YA12 3 prime.
Length = 822 

Score = 28.2 bits (14), Expect = 6.4 
Identities = 14/14 (100%) 
Strand = Plus / Minus 

Query: 2 cctccctccgcgga 15 

         ||||||||||||||

Sbjct: 533 cctccctccgcgga 520

>gi|12764352|gb|BG254536.1|BG254536 602368464F1 NIH_MGC_91 Homo sapiens
cDNA clone IMAGE:4476902 5'.
Length = 1031 

Score = 28.2 bits (14), Expect = 6.4 
Identities = 14/14 (100%)
Strand = Plus / Plus 

Query: 2 cctccctccgcgga 15 

         ||||||||||||||
Sbjct: 849 cctccctccgcgga 862

>gi|13378502|gb|BF961317.1|BF961317 PM3-NN0223-111200-004-d03 NN0223
Homo sapiens cDNA.

Length = 277 

Score = 28.2 bits (14), Expect = 6.4 
Identities = 14/14 (100%) 
Strand = Plus / Minus 

Query: 1 acctccctccgcgg 14 



Sbjct: 89 acctccctccgcgg 76

>gi|12374538|gb|BF957263.1|BF957263 PM3-NN0223-241100-002-b08 NN0223
Homo sapiens cDNA.

Length = 168

Score = 28.2 bits (14), Expect = 6.4 
Identities = 14/14 (100%) 
Strand = Plus / Minus 

Query: 1 acctccctccgcgg 14 



Sbjct: 117 acctccctccgcgg 104

>gi|12323114|gb|BF926150.1|BF926150 CM2-NT0193-301100-562-a12 NT0193
Homo sapiens cDNA.

Length = 417 

Score = 28.2 bits (14), Expect = 6.4
Identities = 14/14 (100%)
Strand = Plus / Minus 

Query: 2 cctccctccgcgga 15 

         ||||||||||||||
Sbjct: 268 cctccctccgcgga 255

>gi|12259862|gb|BF869732.1|BF869732 IL3-ET0114-251000-316-A11 ET0114
Homo sapiens cDNA.

Length = 278 

Score = 28.2 bits (14), Expect = 6.4 
Identities = 14/14 (100%) 
Strand = Plus / Minus

Query: 1 acctccctccgcgg 14 



Sbjct: 73 acctccctccgcgg 60

>gi|12129894|gb|BF800905.1|BF800905 PM1-CI0110-201000-003-f08 CI0110
Homo sapiens cDNA.

Length = 283 

Score = 28.2 bits (14), Expect = 6.4 
Identities = 14/14 (100%) 
Strand = Plus / Plus 

Query: 1 acctccctccgcgg 14 



         ||||||||||||||
Sbjct: 211 acctccctccgcgg 224

>gi|18071436|gb|BF744760.1|BF744760 QV2-BT0635-311000-440-c11 BT0635
Homo sapiens cDNA.

Length = 534 

Score = 28.2 bits (14), Expect = 6.4 
Identities = 14/14 (100%) 
Strand = Plus / Plus 



Query: 2 cctccctccgcgga 15
         ||||||||||||||
Sbjct: 319 cctccctccgcgga 332

>gi|11770407|gb|BE965733.2|BE965733 601B59792R1 NIH_MGC_70 Homo sapiens
cDNA clone IMAGE:3896134 3'.
Length = 1336

Score = 28.2 bits (14), Expect = 6.4
Identities = 14/14 (100%) 
Strand = Plus / Plus 

Query: 1 acctccctccgcgg 14 

         ||||||||||||||
Sbjct: 292 acctccctccgcgg 305

>gi|11766639|gb|BE963121.2|BE963121 601656923R1 NIH_MGC_67 Homo sapiens
cDNA clone IMAGE:3865924 3'.
Length = 1442

Score = 28.2 bits (14), Expect = 6.4
Identities = 14/14 (100%) 
Strand = Plus / Minus 

Query: 3 ctccctccgcggag 16 

         ||||||||||||||
Sbjct: 403 ctccctccgcggag 390

>gi|10348536|gb|BE890328.1|BE890328 601431783F1 NIH_MGC_72 Homo sapiens
cDNA clone IMAGE:3916820 5'.
Length = 794

Score = 28.2 bits (14), Expect = 6.4
Identities = 14/14 (100%) 
Strand = Plus / Plus 

Query: 1 acctccctccgcgg 14 

         ||||||||||||||
Sbjct: 115 acctccctccgcgg 128

>gi|10142985|gb|BE728993.1|BE728993 601562251F1 NIH_MGC_20 Homo sapiens
cDNA clone IMAGE:3831924 5'.
Length = 840

Score = 28.2 bits (14), Expect = 6.4
Identities = 14/14 (100%) 
Strand = Plus / Plus 



2003-307272
Query: 1 acctccctccgcgg 14
         ||||||||||||||
Sbjct: 397 acctccctccgcgg 410

>gi|10095527|gb|BE707262.1|BE707262 PM1-HT0452-060700-008-e08 HT0452
Homo sapiens cDNA.

Length = 592

Score = 28.2 bits (14), Expect = 6.4
Identities = 14/14 (100%) 
Strand = Plus / Minus 

Query: 1 acctccctccgcgg 14 



>gi I 977219G I gb I BEB48 B51. 1 1 BF,S4ftSfil 601070523F1 NIHJJGC_12 Homo sap 
iens cDNA clone IMAGE: 3456940 5'. 
Length = 1035 

Score = 28.2 bits (14), Expect = 6.4 
Identities = 14/14 (100%) 
Strand = Plus / Plus 

Query: 1 acctccctccgcgg 14 



> gi I 97fi8571 1 gb I BE539926.1 1 BE539926 601060667F2 NIH_MGC_10 Homo sap 




||||||||||||||



Sbjct: 332 acctccctccgcgg 345
iens cDNA clone IMAGE:3447161 5'.
Length = 902

Score = 28.2 bits (14), Expect = 6.4
Identities = 14/14 (100%)
Strand = Plus / Plus

Query: 1 acctccctccgcgg 14 

         ||||||||||||||

Sbjct: 411 acctccctccgcgg 424

>gi|9342607|gb|BE397242.1|BE397242 601290754F1 NIH_MGC_8 Homo sapiens
cDNA clone IMAGE:3621253 5'.
Length = 524

Score = 28.2 bits (14), Expect = 6.4
Identities = 14/14 (100%)
Strand = Plus / Plus

Query: 2 cctccctccgcgga 15
         ||||||||||||||
Sbjct: 228 cctccctccgcgga 241

>gi|9332870|gb|BE387505.1|BE387505 601274247F1 NIH_MGC_20 Homo sapiens
cDNA clone IMAGE:3615538 5'.
Length = 637

Score = 28.2 bits (14), Expect = 6.4
Identities = 14/14 (100%) 
Strand = Plus / Plus 

Query: 1 acctccctccgcgg 14 

         ||||||||||||||
Sbjct: 422 acctccctccgcgg 435

>gi|8140649|gb|AW950985.1|AW950985 EST363055 MAGE resequences, MAGA
Homo sapiens cDNA.

Length = 638 

Score = 28.2 bits (14), Expect = 6.4 
Identities = 14/14 (100%)
Strand = Plus / Plus 

Query: 2 cctccctccgcgga 15 

         ||||||||||||||
Sbjct: 273 cctccctccgcgga 286

>gi|8139665|gb|AW950129.1|AW950129 EST362094 MAGE resequences, MAGA
Homo sapiens cDNA.

Length = 611 

Score = 28.2 bits (14), Expect = 6.4 
Identities = 14/14 (100%) 
Strand = Plus / Plus

Sbjct: 273 cctccctccgcgga 286

Database: GenBank Human EST entries 

Posted date: Mar 29, 2002 2:35 AM
Number of letters in database: 2,114,234,064 
Number of sequences in database: 4,280,058 
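
The hits listed above are all 14-base exact matches whose query coordinates are 1-14, 2-15 or 3-16, on either the plus or the minus strand. The following is a minimal Python sketch of that bookkeeping, not the search program itself. It assumes the 16-letter query is acctccctccgcggag (assembled here from the overlapping "Query:" lines), and the function names and the toy subject sequence are hypothetical.

```python
# Hypothetical helper names; the 16-nt query is assembled from the
# overlapping "Query:" lines in the listing (1-14, 2-15 and 3-16).
QUERY = "acctccctccgcggag"

_COMPLEMENT = str.maketrans("acgt", "tgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a lowercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def windows(seq: str, size: int):
    """Yield (start, end, subsequence) with 1-based inclusive coordinates."""
    for i in range(len(seq) - size + 1):
        yield i + 1, i + size, seq[i:i + size]


def find_hits(subject: str, size: int = 14):
    """Report the first exact match of each query window on both strands.

    Minus-strand hits are reported with descending subject coordinates,
    the way the listing prints "Strand = Plus / Minus" alignments.
    """
    hits = []
    n = len(subject)
    rc = reverse_complement(subject)
    for q_start, q_end, word in windows(QUERY, size):
        pos = subject.find(word)
        if pos != -1:
            hits.append((q_start, q_end, "Plus", pos + 1, pos + size))
        pos = rc.find(word)
        if pos != -1:
            # Map coordinates on the reverse complement back to the subject.
            hits.append((q_start, q_end, "Minus", n - pos, n - pos - size + 1))
    return hits


if __name__ == "__main__":
    for start, end, word in windows(QUERY, 14):
        print(start, end, word)
    # Toy subject: five filler bases, then the reverse complement of the query.
    toy = "ttttt" + reverse_complement(QUERY)
    for hit in find_hits(toy):
        print(hit)
```

In this sketch a minus-strand hit is found by searching the reverse complement of the subject and mapping the position back, which is why the subject coordinates run from high to low, as in the "Plus / Minus" entries of the listing.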
Lambda K H
1.37 0.711 1.31

Gapped
Lambda K H
1.37 0.711 1.31

Matrix: blastn matrix:1 -3
Gap Penalties: Existence: 5, Extension: 2
Number of Hits to DB: 5013 
Number of Sequences: 4280058 
Number of extensions: 5013 
Number of successful extensions: 5013 
Number of sequences better than 10.0: 61 
length of query: 16 
length of database: 2,114,234,064 
effective HSP length: 15 
effective length of query: 1 
effective length of database: 2,050,033,194
effective search space: 2050033194 
effective search space used: 2050033194 



T: 0
A: 30
X1: 6 (11.9 bits)
X2: 15 (29.7 bits)
S1: 12 (24.3 bits)
S2: 14 (28.2 bits)
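
The relation between the raw score of 14, the 28.2 bits and the Expect value reported for each hit can be checked from the Lambda, K and effective search space printed above. The sketch below applies the standard Karlin-Altschul formulas. It assumes the printed, rounded parameter values, so the Expect value it gives is only approximately the 6.4 shown in the listing. The function names are illustrative.

```python
import math

# Values copied from the statistics block above; the printed Lambda is
# rounded to three significant figures, so results are approximate.
LAMBDA = 1.37
K = 0.711
SEARCH_SPACE = 2_050_033_194  # "effective search space"


def bit_score(raw_score: int) -> float:
    """Convert a raw alignment score to bits: (lambda*S - ln K) / ln 2."""
    return (LAMBDA * raw_score - math.log(K)) / math.log(2)


def expect(raw_score: int) -> float:
    """Karlin-Altschul expectation: E = K * N * exp(-lambda * S)."""
    return K * SEARCH_SPACE * math.exp(-LAMBDA * raw_score)


if __name__ == "__main__":
    print(f"{bit_score(14):.1f} bits")
    print(f"E ~ {expect(14):.1f}")
```

With these inputs the bit score rounds to 28.2. The computed Expect value is somewhat above 6.4, which is consistent with Lambda having been rounded in the printout.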
1: BM915385. AGENCOURT_6701642... [gi:19365764]



IDENTIFIERS
dbEST Id: 11598757
EST name: AGENCOURT_6701642
GenBank Acc: BM915385
GenBank gi: 19365764

CLONE INFO
Clone Id: IMAGE:5481560 (5')
Plate: LLCM2006 Row: d Column: 09
DNA type: cDNA

PRIMERS
PolyA Tail: Unknown

SEQUENCE
CGCCCTGGGCCATCTCCCTCCCACCTCCCTCCGCGGAGCAGCCAGACAGCGAGGGCCCCG
GCCGGGGGCAGGGGGGACGCCCCGTCCGGGGCACCCCCCCGGCTCTGAGCCGCCCGCGGG
G(X:GGCCT(^CCCGGAGCGGAG<}MGGAGTCG€CGA(JGAGCAGCC1'GAGGCCCCACAGT
CTGAGACGAGCCGCCGCCGCCCCCGCCACTGCGGGGAGGAGGGGGAGGAGGAGCGGGAGG
AGGGACGAGCTGGTCGGGAGAAGAGGAAAAAAACTTTTGAGACTTTTCCGTTGCCGCTGG
GAGCCGGAGGCGCGGGGACCTCTTGGCGCGACGCTGCCCCGCGAGGAGGCAGGACTTGGG
GACCCCAGACCGCCTCCCTTTGCCGCCGGGGACGCTTGCTCCCTCCCTGCCCCCTACACG
GCGTCCCTCAGGCGCCCCCATTCCGGACCAGCCCTCGGGAGTCGCCGACCCGGCCTCTCG
CAAAGACTTTTCACCATACCTCGGGCGCACCCTCTGCACGCGGCCTTCATCACCGGCCTG
TCTACTGAGCCCCCGCGGATGCCTAGACCCTTTCTCCTCCGGGAGACGGATCCCTCTCCG
ACCTGCCGCAAATTCCCTATTCTGGAACACCCCCGCTTCCTGGGACCCTAATCCCCGCCT
TTCGACGCTCCTTGCGCTGGGGAACTGAAGAGCCCCCGGGTTCGTAACCTTTTCCTTCCC
CGTTTTGAAAAACATCCCCCCTTAATAAACCTTGACTATTTTCGCTTTGGGCCCCCCCCT
TACGGnTn^GGCXiGGCACTAAACAAACATCGAGTCTCAAGGCGGCGGATGCCACTCAAG
CCTGAATACTTTTGCGCGTTAGGGGCGGTCTTTTACGCGAGTAGAGTCGGGCCTTGACCG
GACCCTATTCATTGGTTTCCCGTGACGTGTGCGGGCGTAACGAGATATTAACCTCTCCCG
ACACATTGTCATAAMCACCACrrrQiACACGCCCTACTCCTGTTAATAGTCGCCCCCTC
CCCGCGTGTAAAATTTCCCGCGCCAATGCCCTCCATTATTCCGCTCCATGAAAAAGGGGG
TCGGCN

Quality: High quality sequence stops at base: 467

Entry Created: Mar 11 2002
Last Updated: Mar 12 2002



COMMENTS
Tissue Procurement: DCTD/DTP
cDNA Library Preparation: Rubin Laboratory
cDNA Library Arrayed by: The I.M.A.G.E. Consortium (LLNL)
DNA Sequencing by: Agencourt Bioscience Corporation
Clone distribution: MGC clone distribution information can
be found through the I.M.A.G.E. Consortium/LLNL at:



LIBRARY
Lib Name: NIH_MGC_41
Organism: Homo sapiens
Organ: skin
Tissue type: amelanotic melanoma, cell line
Lab host: DH10B (phage-resistant)
Vector: pOTB7
R. Site 1: XhoI
R. Site 2: EcoRI
Description: cDNA made by oligo-dT priming. Directionally cloned into
EcoRI/XhoI sites using the following 5' adaptor: GGCACGAG(G).
Library constructed by Ling Hong in the laboratory of
Gerald M. Rubin (University of California, Berkeley) using
ZAP-cDNA synthesis kit (Stratagene) and Superscript II RT
(Life Technologies). Note: this is a NIH_MGC Library.



SUBMITTER
Name: Robert Strausberg, Ph.D.
E-mail:

CITATIONS
Title: National Institutes of Health, Mammalian Gene Collection
(MGC)
Authors: NIH-MGC
Year: 1999
Status: Unpublished



Revised: October 24, 2001.
Check on EST in GenBank:
Query= (1086 letters)

Database: All GenBank+EMBL+DDBJ+PDB sequences (but no EST, STS,
GSS, or phase 0, 1 or 2 HTGS sequences)

1,205,903 sequences; 5,297,758,116 total letters



                                                                 Score    E
Sequences producing significant alignments:                      (bits)  Value

gi|10863872|ref|NM_000660.1| Homo sapiens transforming grow...   587   e-165
gi|18590091|ref|XM_085882.1| Homo sapiens similar to transf...   587   e-165
gi|11424057|ref|XM_008912.1| Homo sapiens transforming grow...   587   e-165
gi|7684381|gb|AC011462.4|AC011462 Homo sapiens chromosome 1...   587   e-165
gi|37097|emb|X05839.1|HSTGFBG1 Human transforming growth fa...   587   e-165
gi|37092|emb|X02812.1|HSTGFB1 Human mRNA for transforming g...   587   e-165
gi|340526|gb|J04481.1|HUMTGFB1PR Homo sapiens transforming ...   587   e-165
gi|12654682|gb|BC001180.1|BC001180 Homo sapiens, Similar to...   291   8e-76
gi|12652748|gb|BC000125.1|BC000125 Homo sapiens, Similar to...   291   8e-76
gi|18490115|gb|BC022242.1| Homo sapiens, clone MGC:22008 IM...   153   4e-34
gi|755044|gb|M23703.1|PIGTGFB1A Sus scrofa transforming gro...   129   6e-27
gi|7650477|gb|AF249327.1|AF249327 Rattus norvegicus TGF-bet...    66   8e-08
gi|4416081|gb|AF105069.1|AF105069 Rattus norvegicus transfo...    66   8e-08
gi|2394170|gb|AF015683.1|AF015683 Rattus norvegicus transfo...    66   8e-08
gi|6755774|ref|NM_011577.1| Mus musculus transforming growt...    64   3e-07

gi 1 1161133 1 gb I L42456.ll MUSTttFI OOI Mus musculus TGF-1 gene, ... 
.64 3e-07 

gi 1 3688423 1 emh 1 AJ009S69. 1 I MMTT0n98fi? Mus musculus mRNA for t. . . 
64 3e-07 

gi|201947|gb|M57902.1|MUSTGFB1 Mouse transforming growth fa...    64   3e-07

gj 1 18042365 \ gb 1 AC0974B3 3 1 Homo sapiens BAC clone RP11-146N. .. _A 
4 0. 30 

gi|17481821|ref|XM_008785.8| Homo sapiens one cut domain, f...    44   0.30
gi|12737997|ref|XM_007116.2| Homo sapiens Zic family member...    44   0.30
gi|6005961|ref|NM_007129.1| Homo sapiens Zic family member ...    44   0.30
gi|11065969|gb|AF193855.1|AF193855 Homo sapiens zinc finger...    44   0.30
gi|4758847|ref|NM_004852.1| Homo sapiens one cut domain, fa...    44   0.30

gj 1 15787728 I emh I AL355338 S3 1 AT ,3553 38 Human DNA sequence fro. . . 
44 0.30 

gi|4028591|gb|AF104902.1|AF104902 Homo sapiens ZIC2 protein...    44   0.30
gi|1531593|gb|U50523.1|HSU50523 Human BRCA2 region, mRNA se...    44   0.30
gi|4468940|emb|Y18198.1|HSAY18198 Homo sapiens mRNA for ONE...    44   0.30

gi 1 19067958 1 gh I AY049805. 1 I Alopias pelagicus 5. 8S rihosomal... _42 
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gil lEQZMBh I gh I AY 0 37 8 58.1 1 Cercopithicine herpesvirus 15 st. 
1.2 

gil 12A3324 8 1 g b 1 AC0 2 0659.5 1 A GQ2Qfi5a Homo sapiens chromosome 
42 1.2 

gi 1 19909461 1 gb I AC098709.3 i Mus musculus clone RP23-1K14, co. 
Q 4.6 

gil 19921137 I ref I NM 135651.11 Drosophila melanogaster (CG47. 
JQ 4.6 

gi I 1 8376846 I gh I AC092198.2 1 Homo sapiens chromosome X clone . 
Q 4.6 

gj 1 18467841 I ref 1 XM 078995.1 1 CG4751 (CG4751), mRNA 
4.6 

gi I 1 8376869 I gh I ACW91 898.2 1 Homo sapiens chromosome, 5 clone . 
Q 4.6 



Q 4.6 

gi I 1 5887302 I gh I AC02091 4.8 1 Homo sapiens chromosome 19 clone. . 
Q. 4.6 

gi I 14578122 1 gh I AC092241. 1 1 AC092241 Drosophila melanogaster, 
4Q 4.6 

gi 1 1 5292266 I gh I AYOB 1.978. 1 I Drosophila melanogaster LD44770 . . 
4.6 

gi I 15055218 I gh I AC060226.39 1 Homo sapiens 12 BAC RP11-101P14. . 
Q 4.6 

gi 1 14389338 I gh I AO084 282.fi I A(M84282 Oryza sativa chromosome 
H 4.6 

giM3fi771fi7lg hl ACOTS977.9IAC015977 Homo sapiens clone RP11- 
40 4.6 



Lgbl 



Homo sapiens chromosome 5 clone . .
gi|9910225|ref|NM_020179.1| Homo sapiens FN5 protein (FN5), ...    40   4.6

gi 1 10440613 1 grb I AC069 145.5 f AC0601 45 Oryza sativa chromosome ... _ 
4Q 4.6 

gi 1 1 0728714 1 gh I AE003631 .2 1 AE003631 Drosophila melanogaster ... _ 
4Q 4.6 

giia24JS422LghlAElSIiaZJL^19..7.13.7 Homo sapiens FN5 protein ... _ 
4Q 4.6 

gi I 4190938 1 gb I AC000091.1 1 AC000091 Homo sapiens Chromosome 2. . . _ 
40 4.6 

gi 1 17431932 1 emb 1 AL646085.1 1 AT ,646085 Ralstonia solanacearum ... 
40. 4.6 

gi 1 1 5073719 1 emb I AL59 1785.1 1 SME5917S5 Sinorhizobium meliloti. . . 
ill 4.6 

gi 1 362 8578 1 g b I A C Q05 1 1g!li A C005115 Drosophila melanogaster D. . . _ 
4Q 4.6 

gi 1 31 5(1432 1 gb 1 11 50080 1 1 T,SIJ5008O Lymnaea stagnalis serotonin... 
40 4.6 

gi 1 8052359 1 smh I AL356592. 1 1 SC9H11 Streptomyces coelicolor co. . . _ 
4Q 4.6 

gi 1 6624640 1 nmh I AL034344.24 1 H5118B18 Human DNA sequence from. . . 
40 4.6 

gi 1 1 5528721 1 dhj I AP003296.3 1 Oryza sativa (japonica cult ivar. . . _4Q 
4.6 

gi 1 1 5289781 1 Hhj I AP003141.2 1 Oryza sativa (japonica cultivar. . . _4fi 
4.6 

gi 1 6069643 1 dh j I AP000616. 1 1 Oryza sativa (japonica cultivar-. . . _JQ 
4.6 

gi|960285|gb|L46862.1|RATLAMB2G Rattus norvegicus laminin B...    40   4.6
gi|198704|gb|J03749.1|MUSLAMB2B Mouse laminin B2 gene, exon...    40   4.6
gi|198702|gb|J02930.1|MUSLAMB2A Mouse laminin B2 chain mRNA...    40   4.6
gi|198694|gb|J03484.1|MUSLAM2B Mouse laminin B2 chain mRNA,...    40   4.6
Alignments

>gi|10863872|ref|NM_000660.1| Homo sapiens transforming growth factor,
beta 1 (Camurati-Engelmann disease) (TGFB1), mRNA
Length = 2745

Score = 587 bits (296), Expect = e-165
Identities = 356/377 (94%), Gaps = 1/377 (0%)
Strand = Plus / Plus

Query: 246 cgagctggtcgggagaagaggnnnnnnncttttgagacttttccgttgccgctgggagcc 305
           |||||||||||||||||||||       ||||||||||||||||||||||||||||||||
Sbjct: 225 cgagctggtcgggagaagaggaaaaaaacttttgagacttttccgttgccgctgggagcc 284

Query: 306 ggaggcgcggggacctcttggcgcgacgctgccccgcgaggaggcaggacttggggaccc 365
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct: 285 ggaggcgcggggacctcttggcgcgacgctgccccgcgaggaggcaggacttggggaccc 344

Query: 366 cagaccgcctccctttgccgccggggacgcttgctccctccctgccccctacacggcgtc 425
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct: 345 cagaccgcctccctttgccgccggggacgcttgctccctccctgccccctacacggcgtc 404

Query: 426 cctcaggcgcccccattccggaccagccctcgggagtcgccgacccggcctctcgcaaag 485
           |||||||||||||||||||||||||||||||||||||||||||||||||||| |||||||
Sbjct: 405 cctcaggcgcccccattccggaccagccctcgggagtcgccgacccggcctcccgcaaag 464

Query: 486 acttttcaccatacctcgggcgcaccctctgcacgcggccttcatcaccggcctgtctac 545
           ||||||| ||| ||||||||||||||| |||||||| ||||||||| ||||||||||| |
Sbjct: 465 acttttccccagacctcgggcgcaccccctgcacgccgccttcatccccggcctgtctcc 524

Query: 546 tgagcccccgcggatgcctagaccctttctcctccgggagacggatccctctccgacctg 605
           |||||||||||| || ||||||||||||||||||| ||||||||||| ||||||||||||
Sbjct: 525 tgagcccccgcgcat-cctagaccctttctcctccaggagacggatctctctccgacctg 583

Query: 606 ccgcaaattccctattc 622
           || || || ||||||||
Sbjct: 584 ccacagatcccctattc 600
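
The "Identities" and "Gaps" figures in the alignment header are column counts over the aligned rows. As a minimal illustration, the Python sketch below recomputes them for the last row pair shown above (Query 606-622 against Sbjct 584-600) and rebuilds its match line. The function names are hypothetical, and the per-row counts are an illustration of the arithmetic rather than output of the search program.

```python
def identity_stats(query_row: str, subject_row: str):
    """Count identical columns and gap columns in one aligned row pair."""
    if len(query_row) != len(subject_row):
        raise ValueError("aligned rows must have equal length")
    pairs = list(zip(query_row, subject_row))
    identities = sum(1 for q, s in pairs if q == s and q != "-")
    gaps = sum(1 for q, s in pairs if q == "-" or s == "-")
    return identities, gaps, len(pairs)


def match_line(query_row: str, subject_row: str) -> str:
    """Build the '|' line printed between the Query and Sbjct rows."""
    return "".join(
        "|" if q == s and q != "-" else " "
        for q, s in zip(query_row, subject_row)
    )


if __name__ == "__main__":
    query_row = "ccgcaaattccctattc"    # Query: 606 ... 622
    subject_row = "ccacagatcccctattc"  # Sbjct: 584 ... 600
    print(identity_stats(query_row, subject_row))
    print(match_line(query_row, subject_row))
    # Overall figure from the alignment header: 356 identities in 377 columns.
    print(round(100 * 356 / 377), "%")
```

Summing such per-row counts over all rows of an alignment gives the totals reported in its header, here 356 of 377 columns.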
[0083]
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1 cx^^-tf^ISi— ^^C^mRN Afrhftt>*Ltii5' J&tWffl* ^XteR— 

ocdna5^ r?v-*c>mmfri*. mmms^umLfz^^^ ncbi 

BLAST (http://www.ncbi.nlm.nih.gov/BLAST/) Oi^^W5^V7h')x7V'Ja-y3>'(:J:oT^fU n£
-coiB?iJ*^<0|WI;££*T -7 £ t#ti*o £OJ:?fci*— OE^J*^<*>£T&& 
v^-eMS^KftlfcL, ffi— OSfcH-^S>x.& VT-£Tco*^OiitifcK:^t-&m<OP£ 
— co * 7<D^5-*frffi1r hZ.t iFX* § & o £T co * ^ Ojfe&fcStt & 9 7 
(Dm^li, ZrV&m-Xl* cDNA7^7'7'J -*4>«ft<Z>mRNA<Z>*E^»«>5£ 

[0084]

•ClSo iOiHMfi, NCBI BLAST (http://www.ncbi.nlm.nih.gov/BLAST/) <7)i9^I$^V7 >7x

7vja->3>»-c, 5' ^#^E^j*^£^Age^j&c#LT^j$ 

A <D^\z li, £SB« $ fitz r 7* - + K i o tiM^ <M# 

[0085]

mmm<omfe*vf&K't&o ffittfc&^x, sgfutos' ^j:«9^±^co 

DNAJi. Sit, at^H&mSravhn-^-r^^jcfflv^tL*; &3:£A^<0
Transcription Regulatory Region Database (TRRD)
(http://wwwmgs.bionet.nsc.ru/mgs/dbases/trrd4/)
TRANSFAC (http://transfac.gbf.de/TRANSFAC/)
TFSEARCH (http://www.cbrc.jp/research/db/TFSEARCH.html)
Promoter Inspector, provided by Genomatix Software
(http://www.genomatix.de/)
[0086]



Afcim^&o.tfv*7~ifmm& (pcr) ic^it^*, £t^i$f»;& 

V* * y =f d T 7° 9 W v - Jfr <b # tt x. RNA^f) $ tL7tif a £ Jfi V> 

cDNA9^77'J -5&^#^<0^ c D N A *^<Si"* £ t &X § & 0 
[0087]
[0088]

^:T4RNA'J tf—tffcJ: *9> * ^^LfcmRNA^) 5 ' 

^Kf^l" * i #t § , d £ Ktim LtzXd K. 3S*£#<7> ? n ~ - > Steffi v> 
[0089]

[0090]
[0091]
[0092]

y ^ D N A *<D -fu * - * -^J*OS^WlHliR^ & prfglC & & 0 
[0093]

J ^^f^no -Oi^j: NEJf&ROJ li, (1) IE¥»l*6»ffio^PfiE, ( 
i £ K J: «9 & £ § & « 2 a fcttWoaHS^SgoR 

[0094]